Success in failure


A failed crop trial of genetically modified wheat still provides crucial lessons for those 
battling to provide the planet’s growing population with a sustainable food supply. 


It is rare for failures to be lauded in science. History, as it is often
said, is written by the winner. The history of research is no different.

But failure in science is vital. Another cliché about history is equally
applicable to scientific flops: people who are ignorant of them are
doomed to repeat them. Which brings us to a green (and to some, an
unpleasant) field in England.

In 2012, a team based at Rothamsted Research, an agricultural-science
institute a short train ride north of London, planted wheat that they had
genetically modified to emit a chemical used by aphids as a warning that
they are under attack. The researchers wanted to see whether this would
give the crops a way of repelling the damaging pests. They thought that
the chemical might also attract insect parasites alerted to the promised
presence of aphids.

Before they got the chance, the crops attracted a swarm of protesters.
Opponents of genetic modification (GM) technology mounted an imaginative,
if sometimes bizarre, campaign against the trial, complete with dubious
scientific claims, loaves of bread adorned with cartoon cow heads, and
videos promising to “Take the Flour Back”, complete with rock-music
soundtrack. The research itself cost £732,000 (US$1.2 million) over five
years. Securing the site from those who threatened to tear it up cost
nearly £1.8 million.

The idea behind what has, rather unfortunately, become known as ‘whiffy
wheat’ showed promise in the laboratory. Yet in field trials the crop is
an unquestionable failure. A paper published on 25 June in the journal
Scientific Reports notes that the GM crops “showed no reduction in aphids
or increase in parasitism” compared with controls (T. J. A. Bruce et al.
Sci. Rep. http://doi.org/5sr; 2015).

This is disappointing on many levels. First, because of the effort, and
money, that has gone into the concept. Second, because GM crops will
surely have a major role in providing a future sustainable food supply.
As Earth’s population grows, so does its appetite. Work aimed at
increasing crop yields, by both GM and non-GM methods, is among the most
crucial research being conducted on the planet. So hostility towards GM
research, one reason why it is rare for such crop trials to reach
field-scale studies in Europe, is still among the most important societal
issues for science to address.

Some opponents of GM crops have reacted with predictable claims: that the
trial was a waste of money, that investment in GM science should
therefore be cut off, and that this one set-back means the entire concept
is flawed. Hardly.

As with most negative results in research, things can still be learnt
from this trial. The team might yet modify the way their crop emits the
alarm pheromone and may experiment in areas with higher densities of
parasites.

The crop failed, but so did the protests. The research was done; a useful
result was obtained. Ironically, had the protests succeeded and the trial
been abandoned, the protesters would be unable to crow about the crop’s
failure. GM research continues at Rothamsted, as it does around the
world. Some of it will work and some will not.

Those who wish to make an argument against GM crops face major 
problems. The rise of new techniques such as CRISPR means that 
what is and is not a GM organism is an increasingly grey area, both 
scientifically and for regulators. 

And these crops, with all the controversy that comes with them, are no
longer the sole preserve of huge agri-businesses. The use of GM
technology is increasingly being passed to the people who really need it:
those in developing countries who are trying to improve the agriculture
of their nations.


Considering all GM crops as a single case is increasingly problematic.
Consumer-friendly traits, such as apples that do not turn brown, now vie
with nutritional enhancement for developing nations and drought
resistance. Small academic groups around the world are producing locally
tailored varieties alongside the engineered staples that major companies
sell in huge quantities to farmers in the developed world. And the debate
is no longer limited to crops: on page 13, we report on GM pigs that
could soon make their way into the human food chain.

All who care about evidence-based policy-making should thank those who
continue to struggle against both the difficulties of doing science and
the added difficulties caused by people who would see science abandoned.
We will all need the fruits, and the cereals, of their labours.




Gene politics 


US lawmakers are asserting their place in the 
human genetic-modification debate. 


The revelation in April that scientists had edited the genome of a human
embryo, an inevitable development to anyone paying attention to
biotechnological advances, has sparked the biggest bioethical debate of
the year and one that will last for decades. The overwhelming consensus
is that such embryos should not be brought to term in clinical settings,
at least not for now. The debate over when, if ever, that should take
place has played out in the scientific literature in duelling articles,
arguments about the technology’s efficacy and calls for an Asilomar-like
conference on bioethics.

So it is little surprise that lawmakers are weighing in. On 16 June, a
subcommittee of the US House Committee on Science, Space, and Technology
held a hearing on human gene editing with witnesses who included Jennifer
Doudna, a biochemist at the University of California, Berkeley, who was
one of the inventors of the genome-editing system CRISPR, and the
Institute of Medicine (IOM) president Victor Dzau.

The climate was more educational than controversial, with lawmakers
asking the usual questions about the risks, benefits and ethics of
engineering future generations of the human race. Parallels were drawn
with another ongoing debate over ‘three-parent embryos’, in which an egg
cell’s diseased mitochondria are replaced with healthy mitochondria from
a second woman. A decision on whether to allow that procedure in the
United States is in the hands of the US Food and Drug Administration
(FDA), which has commissioned an IOM report on the topic that is due this
winter.

While the research and technology subcommittee grilled the experts, a
separate subcommittee, of the House Appropriations Committee that funds
the FDA, was meeting elsewhere on Capitol Hill to draft the agency’s 2016
budget. The subcommittee wants to take no chances with human
modification: a bill that it released on 17 June bans the FDA from using
public funds to evaluate applications for clinical trials involving
genetically modified human embryos. Ironically, the current wording could
backfire: applications for permission to investigate new drugs are
automatically approved in 30 days unless the FDA blocks them, which would
require funds.

If the budget passes, this clause would be the first time that 
lawmakers have used the FDA to limit human embryo research. A 
1996 law known as the Dickey-Wicker Amendment bans the use 
of federal funds to create human embryos for research, but does not 
pertain to FDA regulation. The National Institutes of Health (NIH) 
reaffirmed in April that heritable genetic modification falls under the 
Dickey-Wicker rule, and director Francis Collins said that clinical 
application of such technology is “viewed almost universally as a line 
that should not be crossed”. 

Nevertheless, Congress is determined to have a say. Deeply embedded in a
report accompanying the appropriations bill are orders from the funding
committee that the FDA appoint “an independent panel of experts,
including those from faith-based institutions with expertise on bioethics
and faith-based medical associations” to evaluate the IOM three-parent
embryo report when it is released and to report back.

Although the FDA budget is far from becoming law (after undergoing
another round of editing, it must still be passed by the House and the
Senate and signed by the president), the implication is clear. The
powerful spending committee that holds the purse strings wants to be
involved in the debate: an understandable and indeed necessary position.
Still, even allowing for political posturing, the apparent pre-emptive
distrust of the IOM’s highly respected peer-review process is alarming.
The perennially underfunded FDA has already spent US$1.17 million on the
IOM committee, and although no budget is set aside for the new panel, it
will soak up money that could be spent elsewhere. Also worrisome is the
religious language, which harks back to 2010 when a court ruled in favour
of religious organizations’ interpretation of Dickey-Wicker and briefly
shut down all NIH-funded human embryonic stem-cell research.

As this journal has said, all voices, including those of faith-based 
groups, should be heard in the debate over human-genome editing; 
indeed, the input of highly influential religious groups is essential to 
make a decision on how and if to regulate, especially in the United 
States. But the IOM committee already includes a professor of religious 
studies — so why duplicate the effort? This mandate to the FDA is not 
one that should come from a secular government, which seems to be 
seeking to impress conservative supporters. As one ethicist put it: “It 
is a sign that the culture wars aren't dead.” 

When it comes to human-genome editing, however, those wars are a reality
that all must face, and that is a good thing. This opening salvo from
Congress shows just how complex the coming debate over human genomic
modification will be. Academics have spent the past months debating among
themselves how to proceed with research and clinical applications,
sometimes acting as though they will be the arbiters of the final
decision. As public awareness of the technology increases, that ethical
discussion will rightly be taken out of their hands alone and planted
firmly in those of broader society.




Light detective 


Smartphone camera set to come to the aid of 
sleuths, scientists and wine lovers. 


As any reader of detective fiction knows, a crucial clue needed to solve
a murder is the time of death. The hero detective, typically, is
frustrated by the vague responses of the forensics team: “sometime
between Tuesday night and Thursday morning” does little to narrow down
the range of suspects. Scientists have long tried to help. And forensic
science, with DNA analysis at the forefront, now ensures that more
real-life criminals can expect a knock on the door, sometimes decades
after they thought they had evaded detection. However, it remains
effectively impossible to accurately judge the age of a bloodstain.
Corpse excluded, bloodstains are typically the most common piece of
evidence encountered at a homicidal-crime scene.

Colour could be the key. After blood leaves the body it starts to dry; as
it does so, it changes from red to brown. Back in 1907, the Italian
researcher Louis Tomellini produced a chart of 12 bloody spots, to
illustrate this colour change over a year. As forensic science developed
through the twentieth century, so did bloodstain analysis. By the 1960s,
researchers were using photospectrometry, recording reflectance spectra
and working out how the rate of Tomellini’s colour changes could be
affected by different atmospheric conditions. These are useful
observations, and forensic analysis of the colour of bloodstains is today
a common part of the forensics team’s work. But the results are still too
variable for the analysis to stand up in court.

Colour provides more-useful data than many might think. Spectrometry is a
valuable technique in many areas, from drug discovery to environmental
monitoring. And astronomers use spectrometers to probe the atmospheres of
distant exoplanets for conditions that might support life. Spectrometers,
in other words, have become indispensable instruments. But they tend to
be expensive, complex machines. The most precise can also be bulky,
making them difficult to use in the field. On page 67 of this issue,
scientists describe a possible step forward. They have built an optical
spectrometer that is both small and powerful, and potentially cheap
enough to find use in consumer electronics, to detect corked wine
perhaps.

Like many modern images, those analysed by the scientists are taken with
a smartphone camera. These are selfies from the quantum world: the camera
is converted into a spectroscope using suspensions of particles called
colloidal quantum dots. Exposed to light, these tiny particles produce
vivid colours, with the shade and hue determined by the particle size.
With the right mixture of particles, a coating can be applied that can
filter and analyse the wavelengths (and so colour) of incoming light.

The research is discussed in a News & Views article on page 39, which
describes how it could be used to produce “ubiquitous sensing elements in
household devices connected to the Internet”. Beware would-be bloody
criminals, your fridge is watching you.


Practical policies can combat gender inequality


Mechanisms to help researchers to balance work and home lives have made a
positive difference to the gender balance at my institute, says Douglas Hilton.


How can science address the gender-inequality problem? It is a persistent
issue that has been highlighted again by the controversy over the recent
comments by Nobel laureate Tim Hunt about his “trouble with girls”.

The problem in biomedical research was starkly demonstrated to me just
before I became director of the Walter and Eliza Hall Institute of
Medical Research in Melbourne, Australia, in 2009. I chaired my first
meeting of the senior academic staff and, despite having had a
high-profile female director, Suzanne Cory, for more than a decade, none
of the 20 department heads or professors in the room were women.

I pledged to improve the gender balance, and five years on, I think we
have made some progress. We now have four female professors or department
heads. That is hardly a reason for wild celebration, but given that we
began from such a woeful base it is a start.

So what have we done? Simply, we asked the people affected, women in
their postdoctoral period, for their ideas.

For our institute, some of the simplest changes included steps to ensure
that all important meetings are held within school hours, to make sure
that researchers with child-care duties can attend.

We have also set up a dedicated office with hot-desks and an adjoining
room in which small children can play and older children can do homework
or watch television, under the supervision of their parents.

And we designated a separate room to allow women to breastfeed their
infants or to express milk. The idea of women expressing milk in a toilet
or a sick room, as was done before, seems as inappropriate as having a
researcher making their coffee there.

What else? We demand that at least half of speakers at all conferences
and workshops organized by the institute are women. And we created a
gender-equality committee, with men and women, to monitor implementation
of policies, gather data on progress and challenge us with new ideas.

That was the easy stuff. Some steps required more thought, major
investment and time. The trend over the past 30 years of postponing
scientific independence by having researchers work for longer as postdocs
is generally problematic, but especially difficult for many women,
because those career-defining years overlap with child-bearing years.
Female postdocs are placed in an invidious position: take some time off
and have your productivity drop to near-zero for a period, or postpone
having children in the hope of obtaining a faculty position.

So we deliberately started to appoint faculty members at a younger age,
in their early to mid-thirties, perhaps after a 2-4 year period as
postdocs. This provides women with resources they can use (postdocs,
research assistants and students of their own) should they take time out
from full-time work to have children and to care for them.

For women who have children during their postdocs, we offer technical
support, paid for by the institute, to make sure that their projects
progress while they are on maternity leave.

We introduced a 5-year, Aus$1.25-million (US$960,000) fellowship to
support a female laboratory head, who can spend the money as she wishes.
It can pay for salaries, for instance, or for consumable expenses. And,
given that the high cost of child care can prevent women from returning
to work, the institute helps to pay for it, up to Aus$15,000 each year
for female postdocs and lab heads with pre-school-age children. Yes, men
pay for child care too, but we have a surfeit of male lab heads, and we
cannot afford to do it for everyone.

We also pay for our female scientists to take children and a carer with
them to academic conferences, both here and abroad. This can cost
hundreds or sometimes a few thousand dollars, but we think that
presenting at meetings is important for career development. We also pay
for a ‘family room’ at local conferences to allow researchers to listen
to talks while accompanied by their children, which is good for both men
and women.

We want to do more. We are planning an on-site child-care centre and new
fellowships to support women returning after extended leave. And we are
considering making the lab-head role more flexible. Could it be done as a
job-share, for example, with two faculty members splitting supervisorial
responsibilities, each working three days a week?

We know that these steps have made a difference. Some are expensive, but
the ‘my-institute-has-no-money’ argument is rarely a good excuse for
inaction. Every institution has some discretionary money and can choose
to spend it in these ways rather than, say, on over-generous recruitment
packages for well-established (usually male) scientists.

Bigger changes have occurred when we have spoken openly, passionately and
sometimes bluntly about our situation and the challenges faced by women
more broadly in Australian academia. The Australian Academy of Science
has become a leader in gender-equality discussions. The Australian
Academy of Technological Sciences and Engineering has undergone a similar
cultural change. And the National Health and Medical Research Council has
issued guidelines and minimum standards on gender equality to
institutions that wish to receive funding. This is progress.


Douglas Hilton is director of the Walter and Eliza Hall Institute of 
Medical Research in Melbourne, Australia. 
e-mail: hilton@wehi.edu.au 
RESEARCH HIGHLIGHTS

Selections from the scientific literature


Roadkill yields panther numbers

By counting the number of endangered panthers hit and killed by cars in
Florida, researchers have estimated the population size of this rare cat.
They say that it is the first statistically robust population estimate
for the animals across their breeding range.

Brett McClintock of the National Marine Mammal Laboratory in Seattle,
Washington, and his team used data on reported collisions with Puma
concolor coryi (fewer than 20 per year) and traffic densities, as well as
information from a small number of radio-collared panthers, to estimate
the total population across the state. They show that panther numbers
seem to be slowly increasing, but may never have exceeded 150 individuals
between 2000 and 2012.

This method could be applicable to other rare animals, the researchers
say.
J. Appl. Ecol. http://doi.org/5sg (2015)


Corals inherit love for heat

Heat tolerance in corals can be passed down the generations, suggesting
that corals can adapt as the climate warms.

Researchers have suggested that corals physiologically acclimatize to
higher temperatures rather than inherit heat tolerance. To test this
idea, Line Bay at the Australian Institute of Marine Science in
Townsville, Australia, Mikhail Matz at the University of Texas at Austin
and their team bred corals (Acropora millepora; pictured) from two
locations in Australia separated by 5° of latitude. Offspring produced by
parents from the warmer area had an up to 10 times greater chance of
survival when exposed to heat, compared with the larvae of parents from
the cooler region. Larvae generated by crossing corals from the warm area
with those from the cool region inherited key genetic differences
associated with heat tolerance.

Corals that thrive in the heat could be moved to other latitudes so that
they reproduce with local corals and introduce heat-tolerant adaptations
to the population.
Science 348, 1460-1462 (2015)


Air pollution triggers floods

A catastrophic 2013 flood in China was probably caused, in part, by air
pollution.

In July 2013, heavy rainfall resulted in a devastating flood in the
mountains northwest of the Sichuan Basin in China (pictured). The basin
has seen increasing industrial activity in the past few decades. Jiwen
Fan at the Pacific Northwest National Laboratory in Richland, Washington,
and her team modelled the region’s atmospheric processes during the storm
using different levels of aerosol emissions. By setting the modelled
emissions at a level similar to that before China’s economic boom, the
team found that the rate of rainfall in the storm would have been up to
60% lower than under current emissions levels. Aerosols trapped in the
basin warm the air and suppress convection, allowing excess moisture to
build up and condense into rain as it rises up over the mountains.

The authors suggest that future severe floods in the region could be
mitigated by reducing air pollution, particularly black carbon.
Geophys. Res. Lett. http://doi.org/5q9 (2015)
Target neurons to
relieve asthma 


Silencing signals from pain- 
sensing nerve cells in the lungs 
reduces the symptoms of 
asthma in mice. 

When stimulated by allergens, these neurons cause airways to constrict
and trigger symptoms such as coughing and wheezing. Bruce Levy and
Clifford Woolf of Harvard Medical School in Boston, Massachusetts, and
their colleagues blocked the activity of these cells and found that this
reduced airway inflammation by reducing the production of
immune-signalling molecules such as IL-5. The team reports that IL-5
triggers pain-sensing neurons to release a peptide called VIP that
stimulates immune cells, creating a feedback loop that sustains
allergies.

The results reveal a potential 
way to treat asthma and 
respiratory allergies. 

Neuron http://doi.org/5rf (2015) 


MATERIALS 


DNA glues 
particles together 


Researchers have assembled micrometre-sized particles into a variety of
crystals using DNA as ‘glue’.

DNA has been used to control the assembly of DNA-coated nanoparticles,
but doing this with larger particles leads to the formation of random
clumps that do not crystallize. To solve this, Marcus Weck, David Pine
and their colleagues at New York University attached many short DNA
strands to the surface of polymer particles. The high density of DNA
strands (5 to 25 times higher than in previous work), along with their
short ‘sticky’ ends and the smooth particle surface, resulted in the
particles self-assembling into various crystalline designs.

The method could be used to make more complex structures out of a range
of materials including metals and semiconductors, the authors say.

Nature Commun. 6, 7253 (2015) 


ASTRONOMY 


Bounty of dark 
galaxies found 


Astronomers have discovered more than 850 faint galaxies, which could be
made mostly of dark matter, in a single galaxy cluster.

Using archived images from the Subaru Telescope in Hawaii, a team led by
Jin Koda at Stony Brook University in New York searched through
observations of the Coma galaxy cluster, which is roughly 101 million
parsecs (330 million light years) away. The team found 854 ultra-diffuse
galaxies, a class of faint galaxy that can be as large as the Milky Way,
but which has only 0.1% of the number of stars. For these galaxies to
remain gravitationally bound together, the researchers show that more
than 99% of their mass must be dark matter.

This suggests that the crowded environment sucks gas away from these
galaxies, leaving them largely unable to form stars.
Astrophys. J. Lett. 807, L2 (2015) 


NEUROSCIENCE

Male mice process 
pain differently 


Male and female mice use 
different types of immune cell 
to process chronic pain. 
Studies of male mice have 
shown that immune cells 
called microglia in the spinal 
cord have an important role in 
chronic pain. To see whether 
this is the same in female mice, 
a team led by Jeffrey Mogil at 
McGill University in Montreal 
and Michael Salter at the 
University of Toronto, both 
in Canada, induced chronic 
pain in both sexes. The team 
then used drugs or antibodies 
to reduce microglia function. 
Whereas pain responses were 
reduced in the males, females 
were unaffected and instead 
recruited a different type of 
immune cell, called a T cell. 
This difference was linked to 
testosterone, which could make 
T cells less able to mediate pain 
in the males, leading to their 
use of microglia instead. 
Nature Neurosci. http://dx.doi.org/10.1038/nn.4053 (2015)


‘Tatooines’ may be common

Planets orbiting a binary star system, like Tatooine, the fictional home
planet of Luke Skywalker in Star Wars, could form with surprising ease.

Most known circumbinary planets orbit close to their stars, where the
competing gravitational forces from the two stars make the orbits of
nearby objects unstable or cause them to intersect. This prevents debris
from clumping together to form planets. But Benjamin Bromley of the
University of Utah in Salt Lake City and Scott Kenyon of the Smithsonian
Astrophysical Observatory in Cambridge, Massachusetts, show with
simulations that a zone exists near the host stars where the orbits of
debris wobble, but do not cross, allowing for planet formation.

This suggests that Earth-sized ‘Tatooines’ could be common and that more
are likely to be discovered soon.
Astrophys. J. 806, 98 (2015)


Farming footprint is rapidly growing

Humans are venturing farther across the oceans and harvesting a greater
proportion of the ocean’s biomass to feed the world’s appetite for
seafood. Reg Watson at the University of Tasmania in Taroona, Australia,
and his colleagues analysed global fisheries, and seafood import and
export data. They found that the minimum distance between where seafood
is sourced and where it is consumed increased nearly sixfold from 1950 to
2011. Humans are now exploiting nearly 40% of the oceans’ primary
productivity, up from roughly 15% in 1950. The team predicts that the
world’s growing demand for seafood will be met only until about 2050,
unless changes are made in marine farming.
Nature Commun. 6, 7365 (2015)


SOCIAL SELECTION

Popular topics on social media


A call to fund people not proposals

Laboratory heads today spend too much time struggling to win funding from
the US National Institutes of Health (NIH), and this pressure to
fundraise is driving young scientists away, according to a much-discussed
commentary in Cell. To address this problem, Ronald Germain, chief of the
laboratory of systems biology at the National Institute of Allergy and
Infectious Diseases (NIAID) in Bethesda, Maryland, argues that the NIH
should make funding decisions based almost entirely on researchers’ past
accomplishments, and not on their future plans for specific projects.
Irakli Loladze, a quantitative ecologist at the University of Maryland
University College in Adelphi, tweeted: “‘person-not-project’-based
scheme can be game changer in how science is funded”. Sally Rockey,
director of the NIH Office of Extramural Research, says that the agency
is already taking steps to streamline the funding process and to support
scientists despite an ever-tightening budget.
Cell 161, 1485-1491 (2015)

NATURE.COM
For more on popular papers: go.nature.com/kwvdu9


> NATURE.COM 

For the latest research published by 
Nature visit: 
www.nature.com/latestresearch 
SEVEN DAYS


EPA overruled 

On 29 June, the US Supreme Court struck down a regulation to limit
mercury emissions from power plants. The court ruled 5-4 that the US
Environmental Protection Agency (EPA) should have conducted a
cost-benefit analysis when it decided to regulate mercury, instead of
doing so later in the regulatory process. The EPA is likely to revise its
regulation in accordance with the ruling. Most utility companies have
already complied with the rule by installing emissions-reduction
equipment or closing down old coal-fired plants.


US-China talks 

The United States and China agreed to continue to cooperate on several
measures, such as cracking down on illegal trafficking of wildlife and
nuclear materials, and extending their five-year bilateral clean-energy
research project. The seventh Strategic and Economic Dialogue, held on
23-24 June in Washington DC, included wide-ranging diplomatic discussions
about climate, security and trade.


Research conflict
The Smithsonian Institution in Washington DC said on 26 June that it
plans to adopt stricter procedures governing conflicts of interest for
its academics. The institution reviewed its policies in February, after
allegations that solar physicist and climate-change sceptic Willie Soon
of the Harvard-Smithsonian Center for Astrophysics in Cambridge,
Massachusetts, failed to disclose his industry funding on his
publications. The Smithsonian says that it will review sponsored awards
for conflicts of interest and adopt standard terms and conditions for
this. It will also automate its annual financial-disclosure programme for
all researchers and adopt a policy that requires scientists to disclose
all sources of funding when publishing.


Alberta carbon tax
The provincial government of Alberta, Canada, is to gradually double its
carbon tax from Can$15 (US$12) per tonne to Can$30 per tonne by 2017.
Environment minister Shannon Phillips announced the plan on 25 June. She
also appointed Andrew Leach, an environmental economist at the University
of Alberta in Edmonton, to lead a panel that will review Alberta’s
overall climate policy before the United Nations climate summit in Paris
this December.


Lions restored to Rwanda after 15 years
Seven lions from South Africa began their journey to Akagera National
Park in Rwanda last week. The big cats, donated by two parks in
KwaZulu-Natal province, where there is a surplus of lions, were selected
for their reproductive potential to repopulate Akagera. The last lion
died in the Rwandan park 15 years ago, after the population was poisoned
by cattle herders in the wake of the country’s 1994 genocide when the
park was left unmanaged.


Cancer campaign 
Oregon Health & Science 
University in Portland 
announced on 25 June that it 
has raised US$1 billion for its 
cancer centre in under 2 years. 
Philip Knight, co-founder 

of the sportswear company 
Nike, issued a challenge to 

the university in 2013: Knight 
would donate $500 million 
only if the centre raised the 
same amount in 2 years. The 
money is thought to be the 
largest successful ‘challenge 
pledge’ in the United States and 
will go to research into cancer 


detection. See page 14 for more. 


Controversial fund 
The European Council agreed 
on 25 June to create a European 
Fund for Strategic Investments
(EFSI), which will support
large projects — including
research — that face financing
bottlenecks. The EFSI concept
alarmed European researchers 
when it was first announced 
by the European Commission 
in January because some of 

its budget will be taken from 
the commission's Horizon 
2020 research programme. 
The commission agreed in 
May to exclude some Horizon 
2020 programmes, such as the 
European Research Council 
and the Marie Skłodowska-
Curie postdoc programme,
from contributing. 


Climate group wins 
In a surprise ruling, a Dutch
court has ordered the 
Netherlands to take measures
to cut its greenhouse-gas
emissions by at least 25% by 
2020, relative to 1990 levels. 
The court cited the undisputed 
state of climate science as 
evidence for its judgment, 
announced on 24 June. Under 
European Union pledges, the 
Netherlands needs to reduce 
its emissions by only 15%. 

But the Urgenda Foundation, 
a Dutch citizens’ climate- 
change platform, said that this 
target is too low and sued the 
government for failing to take 
adequate action to prevent 
citizens from possible harm. 
The Dutch government can 
appeal against the ruling. See 
page 18 for more. 


Water worries 

The water level of Lake Mead, 
the United States’ largest 
reservoir, dipped to a historic 
low of 328 metres on 23 June, 
according to the US Bureau 
of Reclamation. That is lower 
than at any time since 1937. 
The site, located in Nevada, 
is part of the Colorado River 
system, which supplies water 
to roughly 40 million people 
in seven US states. 


SpaceX explosion 

A SpaceX Falcon 9 rocket 
carrying supplies to the 
International Space Station 
(ISS) broke up (pictured) 
shortly after lift-off on 28 June 
from Cape Canaveral, Florida. 
The mission is the third supply
run to the ISS to fail since
late October. “Preliminary
analysis suggests the vehicle
experienced an overpressure
event in the upper stage liquid
oxygen tank approximately
139 seconds into flight,” said
SpaceX, a private company in
Hawthorne, California. The
next supply mission to the ISS
is scheduled for 3 July, when
a Russian Progress spacecraft
will launch from Kazakhstan.
See go.nature.com/wmrem5
for more.


TREND WATCH
Global efforts to combat AIDS
must be scaled up urgently, or
the epidemic is likely to rebound,
warns a report by the Joint
United Nations Programme
on HIV/AIDS and The Lancet
Commission. The document
says that the next 5 years are
crucial to eliminating AIDS as
a major public-health threat by
2030 (P. Piot et al. Lancet
http://doi.org/5sm; 2015). The rate of
new infections is not falling fast
enough, and donations to fight
the epidemic have levelled off,
the report notes.


DNA law tested 


A US company is the first
to face penalties under
the Genetic Information
Nondiscrimination Act, a
US law that protects genetic
privacy. Last week, a federal 
court in Georgia awarded 
US$2.25 million to Jack Lowe 
and Dennis Reynolds, whose 
employer, Atlas Logistics 
Group Retail Services in 
Atlanta, Georgia, tested their 
DNA in a bid to identify who 
had left faeces on its premises. 
Neither man was the ‘devious
defecator’. See go.nature.com/wtsvn3
for more.


AIDS FUNDS PLATEAU 


PEOPLE 


UCL stands ground 
The head of University College 
London (UCL) has resisted 
calls to reinstate Nobel- 
prizewinning biologist Tim 
Hunt, who resigned from 

an honorary professorship 
after a media furore over 
comments he made about 
working with women. Michael 
Arthur, provost of UCL, said 
in a statement on 26 June: “Sir 
Tim has apologised for his 
remarks, and in no way do 
they diminish his reputation 
as a scientist. However, they 
do contradict the basic values 
of UCL — even if meant to be 
taken lightly — and because of 
that I believe we were right to 
accept his resignation.” 


Patent tussle 


Drug giant Eli Lilly has won
a key UK patent fight over
its blockbuster cancer drug
Alimta (pemetrexed), the
company announced on
25 June. Alimta’s patent expires
at the end of 2015, but Lilly,
based in Indianapolis, Indiana,
holds additional patents on a
regimen of vitamins that are
given with the drug. Dublin-
based Actavis (now known as
Allergan) proposed marketing
an alternative form of Alimta
after 2015, but the UK Court
of Appeal ruled that this would
violate Lilly’s vitamin patents.
The decision could allow
Lilly to maintain exclusivity
in Britain and some other
European countries until 2021.


International monies from donor governments to low- and
middle-income countries to tackle HIV/AIDS have levelled off.


6-9 JULY
Research on space
planes, scramjets and
other hypersonic
vehicles will be discussed
at the American Institute
of Aeronautics and
Astronautics conference
in Glasgow, UK.
go.nature.com/hsvavt


7-9 JULY
Commercial and
academic researchers
meet in Boston,
Massachusetts, for the
ISS R&D conference to
discuss innovation on
the International Space
Station. Elon Musk of
SpaceX is among the
speakers.
go.nature.com/dxkff2


9-13 JULY
The US Society for
Developmental Biology
holds its annual meeting
in Snowbird, Utah.
go.nature.com/iy5m29


RESEARCH


LIGO milestone


The Laser Interferometer 
Gravitational-Wave
Observatory (LIGO) has 
tripled the sensitivity of its 
detectors, members of the 
LIGO collaboration said at a 
meeting last week in Waterloo, 
Canada. The team estimates 
that the improvement, revealed 
last month by an ‘engineering
run’ of the upgraded detectors,
will give LIGO a one-in- 

three chance of detecting 
gravitational waves during 

an observation run this 
autumn. LIGO has detectors 
in Washington state and 
Louisiana.


NATURE.COM
For daily news updates see:
www.nature.com/news


These meaty pigs could become the first genetically engineered animals to be approved for human consumption.


GENE EDITING 


Super-muscly pigs created 
by small genetic tweak 


Researchers hope the genetically engineered animals will speed past regulators. 


BY DAVID CYRANOSKI 


Belgian Blue cattle are hulking animals
that provide unusually large amounts
of prized, lean cuts of beef, the result of
decades of selective breeding. Now, a team of 
scientists from South Korea and China says 
that it has created the porcine equivalent using 
a much faster method. 
These ‘double-muscled’ pigs are made 
by disrupting, or editing, a single gene — a 


change that is much less dramatic than those 
made in conventional genetic modification, 
in which genes from one species are transplanted
into another. As a result, their creators
hope that regulators will take a lenient
stance towards the pigs — and that the breed 
could be among the first genetically engi- 
neered animals to be approved for human 
consumption. 

Jin-Soo Kim, a molecular biologist at Seoul 
National University who is leading the work,
argues that his gene edits merely speed up a
process that could, at least in principle, occur 
through a more natural route. “We could do 
this through breeding,” he says, “but then it 
would take decades.” 

No genetically engineered animal has been 
approved for human consumption anywhere 
in the world, owing to fears of negative envi- 
ronmental and health effects. Fast-growing 
transgenic Atlantic salmon have languished 
in regulatory limbo for 20 years with the
US Food and Drug Administration (see
Nature 497, 17-18; 2013).

Kim and his colleagues are part of a grow- 
ing band of researchers who hope that gene 
editing, which can be used to disable — or 
knock out — a single gene, will avoid this. 
Reports of gene-editing applications in 
agriculture include the creation of hornless 
cattle. (Horns make the animals difficult
to handle and are currently burned off in a
painful procedure.) Researchers have also
engineered pigs that are immune to African 
swine fever virus. 

Key to creating the double-muscled pigs 
is a mutation in the myostatin gene (MSTN). 
MSTN inhibits the growth of muscle cells, 
keeping muscle size in check. But in some 
cattle, dogs and humans, MSTN is disrupted 
and the muscle cells proliferate, creating an 
abnormal bulk of muscle fibres. 

To introduce this mutation in pigs, Kim 
used a gene-editing technology called a 
TALEN, which consists of a DNA-cutting 
enzyme attached to a DNA-binding protein.
The protein guides the cutting enzyme to a
specific gene inside cells, in this case in MSTN,
which it then cuts. The cell’s natural repair
system stitches the DNA back together, but some
base pairs are often deleted or added in the
process, rendering the gene dysfunctional.

The team edited pig fetal 
cells. After selecting one 
edited cell in which TALEN 
had knocked out both copies 
of the MSTN gene, Kim's col- 
laborator Xi-jun Yin, an animal- 
cloning researcher at Yanbian 
University in Yanji, China, trans- 
ferred it to an egg cell, and created 
32 cloned piglets. 


Kim and his team have not yet published 
their results. However, photographs of the 
pigs “show the typical phenotype” of double- 
muscled animals, says Heiner Niemann, a 
pioneer in the use of gene-editing tools in 
pigs who is at the Friedrich Loeffler Institute 
in Neustadt, Germany. In particular, he notes, 
they have the pronounced rear muscles that are 
typical of such animals. 

Yin says that preliminary investigations
show that the pigs provide many of the
double-muscled cow’s benefits — such as 
leaner meat and a higher yield of meat per 
animal. However, they also share some of its 
problems. Birthing difficulties result from the
piglets’ large size, for instance. And only 13 of
the 32 lived to 8 months old. Of these, two are 
still alive, says Yin, and only one is considered 
healthy. 

Rather than trying to create meat from 
such pigs, Kim and Yin plan to use them to 
supply sperm that would be sold to farmers 
for breeding with normal pigs. The resulting
offspring, with one disrupted MSTN gene and
one normal one, would be healthier, albeit less
muscly, they say; the team is now doing the
same experiment with another, newer gene-
editing technology called CRISPR/Cas9. Last
September, researchers reported using a different
method of gene editing to develop new
breeds of double-muscled cows and double-
muscled sheep (C. Proudfoot et al. Transg. Res.
24, 147-153; 2015).


Belgian Blue cattle produce prized lean beef.

Because gene editing is a relatively new phe- 
nomenon, countries have only just started to 
consider how to regulate it in agricultural 
plants and animals. There are some signs 
that government agencies will view it more 
leniently than they do conventional forms of 
genetic modification: regulators in the United 
States and Germany have already declared 
that a few gene-edited crops fall outside of 
their purview because no new DNA has been 
incorporated into the genome. But Tetsuya 
Ishii, who studies international biotechnology 
regulation at Hokkaido University in Sapporo,
Japan, and who has done an international
comparison of GM regulations, says that gene
editing will raise increasing alarm as it
progresses in animals.

Kim hopes to market the edited pig sperm to
farmers in China, where demand for pork is on
the rise. The regulatory climate there may
favour his plan. China is investing heavily in
gene editing and historically has a lax
regulatory system, says Ishii. Regulators will be
cautious, he says, but some might exempt genetic
engineering that does not involve gene transfer
from strict regulations. “I think China will go
first,” says Kim. ■


FUNDING


How an Oregon cancer institute
raised a billion dollars


Gains from two-year fund-raising frenzy will aid the early detection of tumours. 


BY HEIDI LEDFORD


Cancer researcher Brian Druker had
no idea that a fund-raising gala
would change his life. On 20 September
2013, armed with a speech that his wife
had written for him, he waited patiently to be
introduced by Philip Knight, the billionaire
co-founder of sportswear brand Nike.

Knight was a friend and benefactor; a
few years earlier, he and his wife Penny had
donated US$100 million to the cancer centre
that Druker directs at Oregon Health
& Science University (OHSU) in Portland.
But nothing had prepared Druker for what
happened next. “Penny and I will donate
$500 million to OHSU, if it is matched in
pledges within two years in a fund-raising
campaign,” Knight said, drawing gasps of
surprise from the audience. “If the campaign
raises $499 million, we are relieved of our
pledge,” he added. Druker turned in shock
to his wife. “What do I do now?” he asked.
So began a frantic two-year scramble at the
OHSU Knight Cancer Institute to boost its
fund-raising — about $10 million in a good 
year — to $250 million annually. On 25 June, 
OHSU announced that it had reached its tar- 
get in 22 months. It is the largest amount a US 
institution has ever raised to win a challenge 
grant, according to the Indiana University 
Lilly School of Philanthropy in Indianapolis. 

“Publicly we were always very confident, 
because if you aren't, people aren't going to 
donate,” Druker says. “But when we first got 
started, we thought, ‘How are we going to do
this?’”

Billion-dollar campaigns are still relatively 
rare, says Bruce Flessner, a fund-raising con- 
sultant at Bentz Whaley Flessner in Minne- 
apolis, Minnesota. And when universities do 
set out to raise that much, he notes, they typi- 
cally take about seven years and dedicate the 
proceeds to all corners of the institution. The 
Knight Cancer Challenge aimed to fund a 
single institute at a university that is far from 
the clusters of wealth found in New York City 
or Silicon Valley. 

“Portland is a great city, but it’s not minting 
billionaires at a fast rate,” says Flessner. “If 
there is a wealthy person in Oregon who 
hasn't been asked to make a gift to that cancer 
programme, I would be shocked.” 


LOCAL APPEAL 
But OHSU does have Druker, a renowned 
physician and researcher who made his name 
by laying the groundwork for the revolu- 
tionary leukaemia drug Gleevec (imatinib). 
The drug was approved by US regulators in 
2001, and turned chronic myeloid leukaemia 
(CML) — once a death sentence for 70% of 
people diagnosed with it — into a long-term, 
manageable disease for 90% of patients. 
Druker’s star power and Knight’s 
showmanship in designing and announc- 
ing the challenge galvanized the grass-roots 
fund-raisers. The campaign received more 
than 10,000 donations, given from 5 coun- 
tries and every US state. Three-quarters of 
the money came from sources in Oregon. 
The largest single donation was $100 million 
from Gert Boyle, chair of the Oregon-based 
company Columbia Sportswear. Boyle's late 
sister, a molecular biologist, died of brain 
cancer and was a scientific mentor to Druker 
when he was an undergraduate at the University
of California, San Diego.


Brian Druker spearheaded a campaign that raised US$500 million in under 2 years.


The campaign decided early on to 
approach Oregon’s state legislature for 
$200 million to construct two buildings 
for the cancer institute. OHSU pitched the 
expanded cancer centre to legislators as a 
way to create jobs for the state while fighting
a disease that is the number-one killer of
Oregonians. When the state senate approved
the measure by a vote of 28-2 in March 2014,
Druker began to believe that Knight’s
challenge could be met.


But two months later, the campaign hit 
a public-relations snag when its advertis- 
ing — designed to be catchy and blunt — sug- 
gested that Druker’s work on Gleevec had 
“cured” CML. The pitches angered some 
people with CML, who must take expensive 
drugs for the rest of their lives while enduring 
side effects and the fear that their cancer will 
become resistant to treatment. Patients said 
that calling Gleevec a cure would slow the 


search for better therapies. Druker issued an 
apology and OHSU toned down the adverts 
to read: “That’s one cancer down. Now we're 
going after other cancers as aggressively as 
they come after us.” 

With the money now in hand, it is time 
for Druker, OHSU and the cancer centre to 
deliver on that promise. Druker aims to rap- 
idly hire up to 30 principal investigators, and 
to provide researchers with a funding cush- 
ion intended to free them from the burden 
of constantly applying for grants. But the 
investigators will also be expected to meet 
research milestones. “We want to make 
progress as quickly as we can,” he says. 

The institute will focus on detecting 
cancers early in their development, when 
treatments generally have a better chance of 
success. Druker also wants the institute to 
take advantage of emerging technologies to 
develop better tests that would reduce false 
diagnoses. 

He is eager to turn his full attention to the 
science, but already feels nostalgic about the 
past two years. “It’s been busy,” he says. “But 
it’s been quite a ride.” ■


Curator Vicki Funk of the US National Herbarium displays one of the collection’s 5 million specimens.


Plant collections 
get pruned back 


North America’s herbaria wilt under budget pressure. 


BY BOER DENG 


Herbaria in North America are withering.
Collections of preserved plant
specimens that have been accumulating
for a century or more are being closed and
consolidated as tight budgets and competition 
for space put pressure on universities to direct 
resources to facilities such as labs. 

More than 100 North American herbaria 
have closed since 1997, leaving just over 
600 remaining. The latest casualty came in 
May, when the University of Missouri in 
Columbia announced that it will close its 
Dunn-Palmer Herbarium, a 119-year-old 
collection of more than 170,000 plants and 
thousands of mosses, algae and fungi. 

There is a perception that herbaria are dead 
places, says plant biologist Kathleen Pryer, who 
manages the herbarium at Duke University in 
Durham, North Carolina. But far from being 
relics, botanists argue, these repositories of 
preserved specimens are relevant to today’s 
research. 

For instance, DNA from specimen plants 
helps botanists to improve the accuracy of 
phylogenetic trees, and surveys of when and 
where specimens were collected can show the 
effects of climate change on species range. Ecol- 
ogists and conservationists will always need to 
be able to distinguish thorn from thistle in the 
field, says biologist Roxanne Keller of the University
of Nebraska Omaha. Digital archives are
useful, she says, but only with the real thing can
you feel the points of a bristle or trace a tendril’s
curl. “You can’t get those details from a picture.”
That sensory experience may be less valued 
these days because many botanists now find 
themselves small players in broader biological- 
sciences departments. Few outside their field 
appreciate the merits of having specimens on 
hand. Department heads and deans are always 
“mystified” about herbaria, Pryer says. 
Herbaria can feel more antique than avant 
garde. The US National Herbarium in Wash- 
ington DC houses a preserved cutting from the 
first Concord grape, a commercially important 
US breed first cultivated in 1849. The label iden- 
tifying a sunflower from South America, brown 
with age, is written in the spidery Cyrillic scrawl
of a nineteenth-century Russian collector.
That fusty feel belies the present-day ques- 
tions that the specimens are being tapped to 
tackle. Isotopic analyses of specimens of the 
rainforest species Humiria balsamifera that 
date as far back as 1788, for example, show that 
as atmospheric carbon dioxide levels increased 
with industrialization, the plants responded by 
increasing photosynthetic activity and using 
water more efficiently (D. Bonal et al. Plant Cell 
Environ. 34, 1332-1344; 2011). The findings are 
important to climate modellers and others who 
want to predict how ecosystems will respond as 
CO₂ levels rise in coming decades.
Researchers have also used specimens collected
during the first few decades of the
twentieth century to track the spread of cheat
grass (Bromus tectorum), an invasive species 
from Europe, throughout the US southwest. 
The pattern supports a growing body of evi- 
dence that successful invasions require multiple 
introductions of an exotic species (A. R. Pawlak 
et al. Biol. Invasion. 17, 287-306; 2015). 

Herbaria do not necessarily disappear 
altogether when they close. Their specimens 
are often absorbed by other institutions: the 
Rancho Santa Ana Botanic Garden in Clare- 
mont, California, for example, has taken in at 
least three other collections since 2000. In one 
case, staff had to race against an impending 
rainstorm to rescue specimens from a load- 
ing dock where they had been unceremoni- 
ously dumped. “We more or less had to drop 
everything and go and fetch it,” recalls Lucinda 
McDade, the garden's executive director. 

Drama also surrounded the 2004 transfer 
of the herbarium at the University of Iowa in 
Iowa City to Iowa State University in Ames. A 
lawsuit tried to stop the move, but eventually 
more than 200,000 specimens were packed up 
and driven the 200 kilometres to Iowa State.
And last year, the Brooklyn Botanic Garden in 
New York said that it would sell the building 
that housed its herbarium. In April, the garden 
lent its collection to the New York Botanical 
Garden until room can be found for it else- 
where. But some curators worry that the move 
out of Brooklyn will prove permanent. 

Merging collections can have benefits, says 
James Miller, vice-president of science and 
conservation at the Missouri Botanical Gar- 
den in St Louis, which will absorb the Dunn- 
Palmer collection. Samples can be better 
curated at larger institutions, and might catch 
more researchers’ attention. But taking in an 
orphaned collection is a mixed blessing. It can 
take years to catalogue the new samples, mak- 
ing it difficult to access them for study. And 
institutions must also find a way to do that 
work at a time of dwindling funds and staff 
cuts. “I’m glad we're getting new specimens,” 
says Miller. “But a part of me is always sad 
when another herbarium closes.” 

Compared with researchers in other biological
sciences, botanists feel that they have long struggled
for respect. In 1988, 72% of the 50 top-funded 
US universities offered advanced degrees in 
botany. More than half of those programmes 
have been jettisoned, even though the need 
for soil and plant scientists is expected to rise 
modestly over the next decade, according to 
the US Bureau of Labor Statistics. “Getting 
people interested in living plants is a chal- 
lenge,” says Pryer. Convincing them of the 
importance of keeping flattened, wizened 
sprigs is even tougher. 

But it is important to do so, botanists say. “A 
lot of us got started studying plants by wan- 
dering into the college herbarium by accident,” 
says Vicki Funk, a botanist and curator at the 
US National Herbarium. “What happens if 
they all get carted off?” ■


Researchers pin down risks
of low-dose radiation


Large study of nuclear workers shows that even tiny doses slightly boost risk of leukaemia. 


BY ALISON ABBOTT 


For decades, researchers have been trying
to quantify the risks of very low doses of
ionizing radiation — the kind that might
be received from a medical scan, or from living
within a few tens of kilometres of the damaged
Fukushima nuclear reactors in Japan. So small
are the effects on health — if they exist at all
— that they seem barely possible to detect. A
landmark international study has now provided
the strongest support yet for the idea that long-
term exposure to low-dose radiation increases
the risk of leukaemia, although the rise is only
minuscule (K. Leuraud et al. Lancet Haematol.
http://doi.org/5s4; 2015).

The finding will not change existing guidelines
on exposure limits for workers in the
nuclear and medical industries, because those
policies already assume that each additional
exposure to low-dose radiation brings with it a
slight increase in risk of cancer. But it scuppers
the popular idea that there might be a threshold
dose below which radiation is harmless — and
provides scientists with some hard numbers to
quantify the risks of everyday exposures.

“The health risk of low-dose radiation is
really very tiny, but the public is very concerned,”
says Bill Morgan, who heads a systems-
biology programme in low-dose radiation at the
Pacific Northwest National Laboratory in Richland,
Washington, and chairs the committee
on radiation effects at the International Commission
on Radiological Protection (ICRP) in
Ottawa, Canada. That concern has driven a lot
of investment in programmes trying to quantify
the risk, he says. The European Commission,
for example, has a 20-year road map to assess
the problem. “We don’t do a very good job of
explaining ourselves to the public, which finds
it hard to put radiation risks in context — some
people go to radon spas to treat their rheumatism
while others won't board planes for fear of
cosmic rays,” he adds.


RADIATION RISKS 

Ionizing radiation — the kind that can pull 
electrons from atoms and molecules and break 
DNA bonds — has long been known to raise 
the risks of cancer; the higher the accumulated 
dose, the greater the damage. But it has proved 
extremely difficult to determine whether this 
relationship holds at low doses, because any
increase in risk is so small that to detect it
requires studies of large numbers of people for
whom the dose received is known. A study of
more than 300,000 nuclear-industry workers
in France, the United States and the United
Kingdom, all of whom wore dosimeter badges,
has provided exactly these data. A consortium
of researchers coordinated by the International
Agency for Research on Cancer (IARC) in
Lyon, France, examined causes of death in the
workers (one-fifth of whom had died by the
time of the study) and correlated this with
exposure records, some of which went
back 60 years.


RISING BACKGROUND
A rise in medical scans over the past two decades has doubled the amount of radiation that the average
American receives each year.
(Chart: average public radiation exposure, in millisieverts per year, broken down by source: radon, cosmic rays,
terrestrial radionuclides, ingestion, medical, consumer products and other. One bar is labelled Germany 2005.)


The workers received on average
just 1.1 millisieverts (mSv) per year above
background radiation, which itself is about 
2-3 mSv per year from sources such as cosmic 
rays and radon. The study confirmed that the 
risk of leukaemia does rise proportionately with 
higher doses, but also showed that this linear 
relationship is present at extremely low levels of 
radiation. (Other blood cancers also tended to 
rise with radiation doses, but the associations 
were not statistically significant.) The results 
were published on 21 June. 

“It is a solid, unusually large study of individuals
exposed to very low doses of ionizing radiation,”
says epidemiologist Jorgen Olsen, director
of the Danish Cancer Society Research Center
in Copenhagen. The finding implies that some
cases of leukaemia will even be caused by a high
level of natural background radiation, he adds,
“though the increased risk for an individual is
going to be vanishingly small”.

ICRP recommendations, which most 
national radiation-protection agencies follow, 
already call for monitoring of individuals whose 
annual exposure is likely to exceed 6 mSv. They 
restrict exposure to 20 mSv annually over 5 
years, with a maximum of 50 mSv in any one 
year. Researchers expected that 134 of the work- 
ers (4.3 per 10,000 people) would die from leu- 
kaemia as a result of the average 27 years they 
spent in the industry; in fact, 531 people died 
from the disease. Even in this large study, there 
was no direct evidence that workers who had 
accumulated extremely low doses of radiation 
(below a total of 50 mSv) had an increased risk 
of leukaemia, says Olsen. But a mathematical 
extrapolation of the data suggests that each 
accumulation of 10 mSv of exposure raises a 
worker's risk of leukaemia by 0.002%. 
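
To make the linear extrapolation concrete, the short sketch below scales the quoted figure of 0.002% per 10 mSv in direct proportion to accumulated dose, with no threshold. It is an illustration only, not the study's statistical model: the function names are hypothetical, and it assumes that the 0.002% figure can be read as an increase in risk that simply adds up with each further 10 mSv.

```python
# Illustrative linear no-threshold sketch. Not the study's actual model.
# Assumption: the article's 0.002% per 10 mSv accumulates in direct
# proportion to dose, all the way down to zero.

RISK_PERCENT_PER_10_MSV = 0.002  # figure quoted in the article


def career_dose_msv(annual_dose_msv: float, years: float) -> float:
    """Accumulated occupational dose above background, in mSv."""
    if annual_dose_msv < 0 or years < 0:
        raise ValueError("dose and duration must be non-negative")
    return annual_dose_msv * years


def excess_leukaemia_risk_percent(accumulated_dose_msv: float) -> float:
    """Extrapolated excess risk, in percent, for an accumulated dose."""
    if accumulated_dose_msv < 0:
        raise ValueError("dose must be non-negative")
    return RISK_PERCENT_PER_10_MSV * accumulated_dose_msv / 10.0


if __name__ == "__main__":
    # Averages reported for the cohort: 1.1 mSv per year over 27 years.
    dose = career_dose_msv(1.1, 27)
    print(f"accumulated dose: {dose:.1f} mSv")
    print(f"extrapolated excess risk: {excess_leukaemia_risk_percent(dose):.5f}%")
```

Under these assumptions, the cohort's average career dose comes to roughly 30 mSv, which is why the extrapolated increase for a typical worker stays far below one-hundredth of a per cent.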

The data also challenge an ICRP assump- 
tion that accumulated low-dose exposure gives 
a lower risk of leukaemia than does a single 
exposure to the same total dose (based on 
the idea that the body has time to recover if 
the assault comes in tiny, spread-out doses). 
But such details are unlikely to change the 
overall ICRP recommendations, which are 
deliberately conservative, says Thomas Jung, 
from Germany’s Federal Office for Radiation 
Protection in Munich.


MEDICAL SCANS

A major, and increasing, source of low- 
dose radiation comes from the medical 
world, says David Richardson, an epidemi- 
ologist at the University of North Carolina 
and an author of the study. “The amount 
of radiation a US person receives in a year 
on average has doubled, mostly because of 
medical procedures,” he says (see ‘Rising 
background’). Computed-tomography 
(CT) scans are to blame for most of the rise; 
a typical abdominal scan delivers more 
than 10 mSv. Radiologist David Brenner of 
Columbia University in New York has calculated
that of the 25 million people having
CT scans in a year, 1 million will have
accumulated more than 250 mSv over the 
previous 20 years. 

One group that needs to pay particular 
attention to the findings are the tens of 
thousands of health workers who use radio- 
logical imaging to guide catheters through 
blood vessels of patients to reach into their 
hearts and brains, says Martha Linet, at the 
US National Cancer Institute's radiation epi- 
demiology programme in Bethesda, Mary- 
land. These minimally invasive operative 
procedures are used ever more frequently, 
she says. 

Epidemiological studies suggest that 
radiation exposure has health effects beyond 
cancer. The IARC-led consortium is now 
looking at the effect on solid cancers, and 
also on diseases such as heart attack and 
stroke. Other studies are under way to study 
the long-term impact of low-dose radia- 
tion on different cohorts. One, the Epi-CT 
study, is recruiting one million people from 
nine European countries who had CT scans 
as children; its analysis will be complete 
by 2017. In another, the Helmholtz Center 
Munich is analysing heart tissue from work- 
ers who died in the Mayak uranium mines in 
the South Urals, Russia. 

Although the European Commission has 
been funding research on low-dose radia- 
tion for some time, equivalent programmes 
in the United States have stalled. In 2013, 
scientists wrote an open letter to the White 
House Office of Science and Technology 
Policy calling for renewed investment, and 
a bill is currently being debated in Congress
calling for more work. 

Getting funding for such studies is 
important, says Mike Atkinson, head of 
radiation biology at the Helmholtz Center 
Munich. Being able to quantify the effects 
of radiation will help doctors to balance risk 
against benefit when deciding whether to 
put children in CT scanners, he says. And 
further understanding the health impacts 
of low-dose radiation might aid decisions 
about how much remedial activity is needed 
to clean up soil contaminated by radioactiv- 
ity from accidents or nuclear-power works, 
says Morgan.

In the Netherlands, concerns about rising sea levels have led citizens to sue to force emissions cuts.


Courts weigh in 
on climate change 


Successful Dutch climate litigation may encourage action 
across Europe, but US courts seem unlikely to follow suit. 


BY QUIRIN SCHIERMEIER 


A group of Dutch citizens weary of
climate diplomacy are celebrating
after forcing change through
legal action. Last week, following a lawsuit filed
by a citizens’ climate-change platform called
the Urgenda Foundation, a court in The Hague 
ordered the government of the Netherlands to 
cut greenhouse-gas emissions to at least 25% 
below 1990 levels by 2020 — substantially 
greater cuts than are required under the small 
country’s European Union (EU) obligations. 
The ruling could encourage citizens of other 
countries to try using legal avenues to force 
stricter climate policies, says James Thornton, 
the London-based chief executive of Client 
Earth, an international group of environmental
lawyers. “This is a very powerful decision
with possible far-reaching repercussions,” he
says. “It is forcing the use of undisputed scien- 
tific results for responsible policy-making — a 
very remarkable step.” 

The Dutch government may still appeal the 
ruling, and even if it does have to implement 
extra emissions cuts, these would barely dent 
global greenhouse-gas emissions. But the court 
made clear that although Dutch policy-makers 
can do little to reduce emissions in China or 
the United States, they still have an obligation 
to act out of a duty of care for their citizens. 

Thornton hopes that other courts will judge 
similar lawsuits in the same way in future. 
One such case is pending in Belgium, which 
must reduce its emissions by only 15% below 
2005 levels under current EU pledges. But it is 
unclear whether the landmark Dutch ruling, 
and any European lawsuits that might follow, 
will make waves in other parts of the world — 
particularly in the United States. 

In 2007, the US Supreme Court authorized 
the Environmental Protection Agency (EPA) 
to regulate greenhouse-gas emissions that 
contribute to air pollution, because pollution 
could endanger public health or welfare. A 
series of greenhouse-gas reduction plans have 
followed. But attempts to get federal courts to 
order more-substantial cuts have so far come 
to nothing. Four years after the EPA decision, 
the Supreme Court rejected an effort by Cali- 
fornia and five other states to seek a cap on 
emissions from the utilities sector. The states 
argued that greenhouse gases are a ‘public nui- 
sance’; however, the court countered that the 
EPA’s authority to regulate emissions prevented
federal judges from using the public-nuisance 
argument. Attempts by others to claim liability 
against polluters and seek damages under civil 
law have also been unsuccessful. 


LIMITED POWER 
In the United States, “there is no federal constitutional
right to environmental protection”,
says Richard Stewart, an environmental-law
specialist at New York University. “Some state
courts may recognize such a right, but the remedy
might at best be limited to local sources.”
That seems to be the case in Washington 
state, where on 23 June, a Seattle court ordered
the state's ecology department to reconsider a
2014 petition brought by eight school students 
to limit the state's carbon dioxide emissions. 
The petition called for the agency to act in line 
with what scientific evidence says is needed 
to protect the climate and the environment. 
The agency initially denied the petition, but 
has been ordered to report back to the court 
by 8 July. Petitioners’ lawyer Andrea Rodgers, 
of the Western Environmental Law Center in 
Seattle, said that it was the first time a
US court had ordered a state agency to consider
the most current and best available climate
science in deciding regulation on carbon emissions.

It would be unusual for a US court to demand 
a specific level of federal emissions regulation, as 
has happened in the Netherlands, says Michael 
Oppenheimer, who studies geosciences and 
international affairs at Princeton University in 
New Jersey. A court would be likely to do so only 
if there were a large gap between public safety 
and existing regulations, he says. 

“If it became clear that US regulations, along
with actions of other countries, are insufficient, 
then at some future date a court might invoke 
the objective to force stronger action,” he says. 
But, adds Oppenheimer, current US targets are
consistent with “at least some pathways” that
would keep the world’s warming below 2°C, 
the internationally recognized threshold for 
‘dangerous’ climate change.


CORRECTIONS 

The News story ‘Election results delight 
scientists’ (Nature 522, 264-265; 2015) 
stated that Gençay Gürsoy won a seat in the
new Turkish parliament for the HDP. He did 
not; he is a member of the HDP assembly. 
The News story ‘Earth science wrestles with 
conflict-of-interest policies’ (Nature 522, 
403-404; 2015) erroneously stated that 
hydrologist Donald Siegel disclosed the 
provision of water samples by Chesapeake 
Energy Corporation only in a correction 

to his article. In fact, this information was 
included in the acknowledgements of his 
original paper. 


CLARIFICATION 

The News story ‘Earth science wrestles with 
conflict-of-interest policies’ (Nature 522, 
403-404; 2015) did not make clear that 
Siegel’s findings related to gas production in 
general, and not just the process known as 
fracking. This has been clarified in the online 
version of the story.

WEIGHING THE WORLD’S TREES


Researchers are racing to 
determine whether forests 
will continue to act as a
brake on climate change by 
soaking up more carbon. 


BY GABRIEL POPKIN

Redwood trees can store carbon for more than 2,000 years.

In a forest just west of Chesapeake Bay, Geoffrey Parker wraps a tape
measure around a young tulip tree. He jots the reading down in a
field notebook, marks the tree with blue chalk and moves on to the
next trunk. Parker spends about 10 seconds on each tree. Wrap, measure,
record. Since 1987, he and others have logged more than 300,000
tree measurements at their plots in the Smithsonian Environmental
Research Center (SERC) near Edgewater, Maryland.

This 1,070-hectare site is filled with tulip trees, oaks, beeches and
other mostly deciduous trees. Some stout specimens have stood here
for centuries. Others are just a decade old, sprouting from land that was
recently logged. To keep tabs on the growth, the researchers measure
their trees every three to five years.

All that patient record-keeping can help to answer two major questions
about climate change: how much carbon dioxide pollution are
forests mopping up, and will their capacity shrink over time? Studies
from Parker’s group and others reveal that trees around the globe are
going through a growth spurt and are absorbing billions of tonnes of
the greenhouse gas, meaning that forests are putting a brake on global
warming. But there is no guarantee that forests will keep that up, Parker
says. “I think of it like these performance enhancers that some stellar
athletes use: it bumps up performance, but not for ever.”

In fact, studies of some regions suggest that forest growth may already
be slowing down. And humans are adding to the problem by cutting
down trees, especially in tropical forests. Getting an accurate reading
on the status of Earth’s forests is hard because scientists cannot wrap
measuring tapes around the roughly 400 billion trees scattered across
the planet. So researchers are exploring ways to track forest growth more
efficiently, using planes and satellites. And they are feeding all of their
data into sophisticated computer models that are designed to forecast
how trees will respond in the future.

Such forest measurements are sorely needed as nations wrestle with
how to slow climate change. Some plans call
for wealthy governments or private companies
to pay poorer nations in return for safeguarding
the carbon in their forests. With a major
international climate negotiation approaching later this year, and billions
of dollars in forest payments potentially on the table, scientists are
racing to advise countries and other stakeholders about just how much 
carbon trees are storing, and how long that carbon will stay locked up. 

“The critical thing that matters is to what extent the biosphere 
remains a brake on the rate of global climate change,” says Yadvinder 
Malhi, a forest ecologist at the University of Oxford, UK. That brake 
will weaken or disappear if forests take up carbon more slowly. Worse, 
if forests start emitting more carbon than they absorb each year, they 
could become an accelerator. If that were to happen, says Malhi, “it 
makes it all the more challenging for us to bring CO₂ down to avoid
some threshold of dangerous climate change”. 


THE MISSING SINK 

In the 1990s, researchers stumbled across a mystery when they tried to 
track down all of the carbon humans were emitting by burning fossil 
fuels. Measurements showed that roughly three-quarters of the CO₂
was accumulating in the atmosphere and oceans. The remainder was 
presumably captured on land, but no one knew where it was going. The 
problem became known as the ‘missing sink’.

The world’s forests, which pull carbon out of the air through photo- 
synthesis, were a possible hiding place. Today, they collectively hold 
around 650 billion tonnes of carbon, and it seemed plausible that they 
could be mopping up the missing carbon. 

But ecologists were slow to acknowledge that forests could be the miss- 
ing sink. The community's reticence resulted largely from the work of 
pioneering ecologist Eugene Odum. He argued in the late 1960s that 
undisturbed ecosystems rapidly reach an equilibrium, after which they 
lose as much carbon through respiration, death and decay as they gain 
through photosynthesis. Without much evidence to the contrary, Odum’s
paradigm held sway for several decades. “Mathematically, it's convenient if
something is in equilibrium,” says Sebastiaan Luyssaert of the Laboratory
for Climate Sciences and the Environment in Gif-sur-Yvette, France. “We
were happy with it, because it made life easier.”

That started to change as ecologists analysed long-term data from big 
networks of forest research plots. Many of the measurements came from 
a trio of projects: the Amazon Forest Inventory Network (RAINFOR), the 
African Tropical Rainforest Observation Network (AfriTRON) and the 
Smithsonian's Forest Global Earth Observatories (ForestGEO) network, 
which includes the SERC forest and 61 other plots around the world. 

Starting in the late 1990s, scientists with the RAINFOR and Afri- 
TRON networks began reporting that intact tropical forests were add- 
ing biomass, contradicting Odum's hypothesis. At the Chesapeake site, 
Smithsonian ecologist Sean McMahon and his colleagues analysed 
22 years’ worth of data and found that tree stands of all ages were growing
two to four times faster than expected. The tree growth records
are backed up by CO₂ measurements taken on tall towers at more than
20 sites in North America and Europe: these ‘flux towers’ have revealed
that many forests are absorbing more CO₂ than they are giving off.

Researchers suspect several factors are at play. Because trees require 
CO₂ for photosynthesis, the atmospheric build-up of this gas can fertilize
plants, allowing them to grow faster. Also, CO₂ warms the planet,
which can lengthen the growing seasons of trees and speed up temperature-dependent
processes involved in growth. Scientists are currently
teasing out which factors have the largest roles. 

Whatever the cause, all that accelerated growth is having a major 
effect on the global carbon cycle. In 2011, an international team led 
by US Forest Service researchers Yude Pan and Richard Birdsey con- 
cluded that the world’s trees had sequestered enough carbon during 
the period from 1990 to 2007 to account for the entire missing sink.
The hungriest carbon absorbers were the temperate forests, particularly 
areas where abandoned farmland had given way to young, fast-growing 
trees. High-latitude boreal forests ate up a smaller amount, and tropical 
forests, on balance, were not taking up carbon because tropical deforestation
released about as much CO₂ as forests were soaking up. The
team projected that if deforestation were halted, Earth's forests could
take up around half of the carbon emitted by human activity, which
would substantially slow down global warming. 

But the uncertainties in these estimates are large because forest data 
are sparse and vary widely in quality. Many countries have no systematic 
forest inventory system or do not share their data. In their analysis, Pan 
and Birdsey relied largely on RAINFOR and AfriTRON for assessing the 
globe’s old-growth tropical forests. These networks collectively sample 
just a few square kilometres in the Amazon and Africa; they have no data 
from the large and diverse tropical forests of southeast Asia. 

Beyond determining the size and location of the forest sink, scientists 
are trying to assess whether it is changing. In March, the RAINFOR 
team analysed more than 850,000 measurements of approximately 
189,000 individual trees and found that the large Amazon forest carbon 
sink seems to be shrinking. Carbon uptake in their plots during the past
decade was one-third smaller than during the 1990s. 


The researchers suspect multiple factors might be at play. Major
droughts that hit the Amazon in 2005 and 2010 could have slowed
tree growth during this period. Meanwhile, rising temperatures
and CO₂ levels may be accelerating the life cycles of trees: if so, trees are
now dying earlier than expected, says Roel Brienen, an ecologist at
the University of Leeds, UK, and lead author of the study.

Some other researchers are not convinced by the evidence. Helene
Muller-Landau, an ecologist at the Smithsonian's Tropical Forest Research
Institute on Barro Colorado Island in Panama, thinks that the RAINFOR
group is finding an apparent decline now largely because it overestimated
the Amazonian carbon sink during the 1990s. The group’s plots, she says,
sample too small an area (just three square kilometres out of the vast
two-million-square-kilometre Amazon) to support its broad claims. “If
you actually look at the area covered, it’s just so pitifully small,” she says.

There can also be bias in how researchers have typically chosen plots
and measured biomass, Muller-Landau says. Tropical forests can be
hot, humid, buggy, dangerous and in some cases nearly impossible to
reach. So rather than sample randomly, scientists often choose study
sites based on ease of access. And biomass estimates vary depending on
the choice of species-specific equations used to convert circumference
and height measurements; for many tropical trees, reliable equations
are still being worked out.

Although no one doubts that forests are taking up some of the CO₂
emitted by human activity, scientists are still unsure which forests are
sequestering the most carbon, and how much is stored in long-lasting
wood versus in roots and soil.


HELP FROM ABOVE 

Researchers will only ever be able to measure a tiny fraction of the 
world’s trees by wrapping tapes around them one at a time, so they are 
taking to the skies to get a broader perspective. Some planes and satel- 
lites are outfitted with laser-based lidar systems that measure the height 
of the tree tops. Scientists can then estimate an area’s biomass by using 
the forest’s average canopy height and tree type. 
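
A rough sketch of how such an estimate could be computed follows. It assumes a simple power-law relationship between mean canopy height and biomass per hectare, with one pair of coefficients per forest type. The coefficients, the forest-type names and the carbon fraction are all hypothetical placeholders, not values from the research described here; real coefficients have to be calibrated against ground plots.

```python
from statistics import mean

# Hypothetical coefficients (a, b) for: biomass density = a * height ** b,
# in tonnes of dry biomass per hectare. Placeholders only.
ALLOMETRY = {
    "temperate_deciduous": (0.5, 1.8),
    "tropical_moist": (0.4, 2.0),
}

CARBON_FRACTION = 0.5  # assumed share of dry biomass that is carbon


def biomass_density(canopy_heights_m, forest_type):
    """Estimate biomass (tonnes per hectare) from lidar canopy heights."""
    a, b = ALLOMETRY[forest_type]
    return a * mean(canopy_heights_m) ** b


def carbon_stock(canopy_heights_m, forest_type, area_ha):
    """Estimate carbon (tonnes) held in a stand of the given area."""
    density = biomass_density(canopy_heights_m, forest_type)
    return density * area_ha * CARBON_FRACTION


heights = [20.0, 30.0, 40.0]  # made-up lidar returns, in metres
print(carbon_stock(heights, "tropical_moist", area_ha=100))
```

Because the estimate depends on the chosen coefficients, any error in them carries straight through to the carbon total, which is one reason the uncertainties are debated.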

Plane-mounted lidar can collect data for 35,000 hectares in one 
hour, says Gregory Asner, an ecologist with the Carnegie Institution for 
Science in Stanford, California. The uncertainties in his lidar-based 
forest biomass estimates are now down to around 10%, comparable 
to those from ground-based studies, he says, although others say the 
uncertainties in both types of estimates are larger. 

TREE TALES
Networks of research sites around the globe indicate that forests absorb and store about one-quarter of the roughly 10 billion tonnes of carbon emitted by burning fossil fuels each year. But the size of that carbon sink may be shrinking.
The Smithsonian Environmental Research Center is part of the Forest Global Earth Observatory (ForestGEO) network, which includes 62 forest plots in 24 countries.
Researchers will soon start dosing a patch of the Amazon with extra carbon dioxide to see how this tropical forest might fare in the future.
Data from the Amazon Forest Inventory Network (RAINFOR) indicate that trees are absorbing less carbon overall than in decades past.

For a truly global view, scientists agree that nothing beats a satellite.
Current Earth-observing satellites lack the resolution of plane- or
ground-based measurements but can fill in areas where data are scarce
or non-existent. NASA’s Orbiting Carbon Observatory-2 (OCO-2),
launched in July 2014, will soon provide fresh data to help locate the
missing sink. The satellite uses spectrometers to measure concentrations 
of CO₂ to within a few parts per million, allowing scientists to pinpoint
the locations where carbon is being emitted and sequestered (see ‘Tree
tales’). A separate instrument can determine how much photosynthesis
is occurring at a specific location. Although OCO-2 does not measure 
tree biomass directly, it will provide enough data for scientists to deter- 
mine how much carbon is entering and leaving different ecosystems. 
NASA expects to release preliminary results from the satellite by the 
end of the year, but it will be at least several years before the data can 
address whether forest sinks are changing. And even then, the OCO-2 
measurements won't answer whether carbon is going into trees, soil or 
somewhere else, so ground-based observations will still be needed, says 
David Crisp, chief scientist for OCO-2. 


TOMORROW’S TREES 

Other scientists seeking to predict the carbon sink’s future are turning 
the clocks forward — with experiments that expose today’s forests to 
future conditions. One strategy involves piping CO₂ into a forest to
raise concentrations from the present 400 parts per million to roughly 
550 parts per million — a level expected before this century’s end. 

In experiments in the United States and Europe, trees dosed with 
extra CO₂ grew faster, just as expected. But the effects often did not last.
One explanation is that enhanced trees may quickly use up other vital 
nutrients, such as nitrogen, says ecologist Richard Norby at Oak Ridge 
National Laboratory in Tennessee, who led one of the experiments. 

Researchers from the United States, the United Kingdom and Brazil 
are now building a CO₂-enrichment experiment near Manaus, Brazil (see
Nature 496, 405-406; 2013), which they hope to start next year. That 
experiment will provide valuable information about trees in the tropics, 
but it will not represent the future of all forests in that region, says Simon 
Lewis, an ecologist at the University of Leeds and University College 
London who is involved in the RAINFOR and AfriTRON networks. The 
region around Manaus has poorer soils than other parts of the Amazon 
so trees grow more slowly, says Lewis, and “it will take longer for the 
impacts to be seen”. 

The African Tropical Rainforest Observation Network (AfriTRON) has forest plots across Africa's tropics in 13 nations.

WAYS TO MEASURE
Plane- or satellite-based lidar laser systems can measure the height of the treetops, which indicates how much carbon the forest holds.
The Orbiting Carbon Observatory-2 satellite can gauge the amount of carbon absorbed and released by forests.
Tape measures have conventionally been used to measure the growth of trees.

In the meantime, researchers are trying other methods to peer into
the future. Some 20 teams have built Earth-system models that seek to
simulate the climate and vegetation on the planet, including how carbon
moves between the oceans, atmosphere and continents. These models
currently represent forests in a simplified manner, and they disagree 
about the future. Some predict that forests will continue to soak up 
massive amounts of carbon in coming decades, whereas others suggest 
that forests could become stressed by droughts and high temperatures 
and die back, releasing carbon into the atmosphere. 

The emerging insights about forests, from individual tree measurements
to satellite data to computer simulations, will all play a part in
how countries decide to manage their resources. And that has implica- 
tions for global climate negotiations because some carbon-reduction 
schemes rely on rewarding nations for keeping carbon locked up in 
forests. For that to work, researchers will need to find reliable ways 
to track the changing amounts of forest carbon. The current level of 
uncertainty in forest biomass estimates “does not exactly provide a lot of 
confidence”, Muller-Landau says. “Having something verifiable would
have to be fairly key” for carbon accounting, she adds. 

To that end, scientists such as Parker are developing more precise ways 
to monitor trees growing in their experimental plots. On a cloudy spring 
day at the Smithsonian’s Chesapeake site, Parker directs volunteers to 
install spring-tensioned steel bands called dendrometers in a 130-year- 
old stand. As the tree trunks expand over time, they will widen gaps in 
the bands, which can be measured using digital calipers. The technique 
can track changes down to a hundredth of a millimetre — thinner than 
a human hair — giving researchers an unprecedented ability to study 
growth patterns. The method can even detect how trees swell and con- 
tract over a few hours as they absorb or lose water. 
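
The conversion from a band reading to trunk growth is simple geometry. The sketch below assumes that the trunk is roughly circular and that any widening of the gap in the band equals the increase in the trunk's circumference; the function names and the sample numbers are hypothetical.

```python
import math


def diameter_change_mm(gap_start_mm: float, gap_end_mm: float) -> float:
    """Diameter growth implied by a change in the dendrometer gap.

    Assumes the gap widens by exactly the increase in circumference,
    so the change in diameter is the change in circumference over pi.
    """
    return (gap_end_mm - gap_start_mm) / math.pi


def basal_area_increment_cm2(diameter_start_cm: float,
                             gap_start_mm: float,
                             gap_end_mm: float) -> float:
    """Growth in trunk cross-sectional area between two caliper readings."""
    d0 = diameter_start_cm
    d1 = d0 + diameter_change_mm(gap_start_mm, gap_end_mm) / 10  # mm to cm
    return math.pi / 4 * (d1 ** 2 - d0 ** 2)


# A caliper resolution of 0.01 mm on the gap corresponds to about
# 0.003 mm of diameter change under these assumptions.
print(f"{diameter_change_mm(0.0, 0.01):.4f} mm")
```

Dividing by pi is why a gap measured to a hundredth of a millimetre resolves even smaller changes in diameter.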

By the end of the day, Parker’s team has finished attaching several 
more dendrometers. More than a thousand trees at the Smithsonian cen- 
tre now sport the metal rings, and their number is increasing around the 
world. Parker puts his equipment in a truck and drives off towards home. 
But he and his crew will be back soon to check how their trees are responding
as Earth’s climate, and its forests, enter uncharted territory.


Gabriel Popkin is a freelance writer in Mount Rainier, Maryland.

THE ROBOT’S DILEMMA


Working out how to build ethical robots is one of 
the thorniest challenges in artificial intelligence. 


BY BOER DENG

PETER ADAMS


In his 1942 short story ‘Runaround’, science-fiction writer Isaac
Asimov introduced the Three Laws of Robotics — engineering 
safeguards and built-in ethical principles that he would go on to 
use in dozens of stories and novels. They were: 1) A robot may 
not injure a human being or, through inaction, allow a human 
being to come to harm; 2) A robot must obey the orders given 
it by human beings, except where such orders would conflict with 
the First Law; and 3) A robot must protect its own existence as long 
as such protection does not conflict with the First or Second Laws. 

Fittingly, ‘Runaround’ is set in 2015. Real-life roboticists are 
citing Asimov's laws a lot these days: their creations are becoming 
autonomous enough to need that kind of guidance. In May, a panel 
talk on driverless cars at the Brookings Institution, a think tank in 
Washington DC, turned into a discussion about how autonomous 
vehicles would behave in a crisis. What if a vehicle's efforts to save 
its own passengers by, say, slamming on the brakes risked a pile-up 
with the vehicles behind it? Or what if an autonomous car swerved 
to avoid a child, but risked hitting someone else nearby? 

“We see more and more autonomous or automated systems in 
our daily life,” said panel participant Karl-Josef Kuhn, an engi- 
neer with Siemens in Munich, Germany. But, he asked, how can 
researchers equip a robot to react when it is “making the decision 
between two bad choices”? 

The pace of development is such that these difficulties will soon 
affect health-care robots, military drones and other autonomous 
devices capable of making decisions that could help or harm 
humans. Researchers are increasingly convinced that society’s 
acceptance of such machines will depend on whether they can be 
programmed to act in ways that maximize safety, fit in with social 
norms and encourage trust. “We need some serious progress to 
figure out what’s relevant for artificial intelligence to reason suc- 
cessfully in ethical situations,” says Marcello Guarini, a philosopher 
at the University of Windsor in Canada. 

Several projects are tackling this challenge, including initiatives 
funded by the US Office of Naval Research and the UK govern- 
ment’s engineering-funding council. They must address tough 
scientific questions, such as what kind of intelligence, and how 
much, is needed for ethical decision-making, and how that can be 
translated into instructions for a machine. Computer scientists, 
roboticists, ethicists and philosophers are all pitching in. 

“If you had asked me five years ago whether we could make ethical
robots, I would have said no,” says Alan Winfield, a roboticist
at the Bristol Robotics Laboratory, UK. “Now I don’t think it’s such
a crazy idea.”


LEARNING MACHINES 
In one frequently cited experiment, a commercial toy robot called 
Nao was programmed to remind people to take medicine. 

“On the face of it, this sounds simple,” says Susan Leigh Anderson,
a philosopher at the University of Connecticut in Stamford who did 
the work with her husband, computer scientist Michael Anderson 
of the University of Hartford in Connecticut. “But even in this kind 
of limited task, there are nontrivial ethics questions involved.” For
example, how should Nao proceed if a patient refuses her medica- 
tion? Allowing her to skip a dose could cause harm. But insisting 
that she take it would impinge on her autonomy. 

To teach Nao to navigate such quandaries, the Andersons gave 
it examples of cases in which bioethicists had resolved conflicts 
involving autonomy, harm and benefit to a patient. Learning algo- 
rithms then sorted through the cases until they found patterns that 
could guide the robot in new situations.

The fully programmable Nao robot has been used to experiment with machine ethics.

With this kind of ‘machine learning’, a
robot can extract useful knowledge even
from ambiguous inputs (see go.nature.com/2r7nav).
The approach would, in
theory, help the robot to get better at ethical
decision-making as it encounters more situations. But many fear that
the advantages come at a price. The principles that emerge are not 
written into the computer code, so “you have no way of knowing why 
a program could come up with a particular rule telling it something 
is ethically ‘correct’ or not”, says Jerry Kaplan, who teaches artificial
intelligence and ethics at Stanford University in California. 

Getting around this problem calls for a different tactic, many engi- 
neers say; most are attempting it by creating programs with explicitly 
formulated rules, rather than asking a robot to derive its own. Last 
year, Winfield published the results of an experiment that asked:
what is the simplest set of rules that would allow a machine to rescue 
someone in danger of falling into a hole? Most obviously, Winfield 
realized, the robot needed the ability to sense its surroundings — to 
recognize the position of the hole and the person, as well as its own 
position relative to both. But the robot also needed rules allowing it 
to anticipate the possible effects of its own actions. 




Winfield’s experiment used hockey-puck-sized robots moving
on a surface. He designated some of them ‘H-robots’ to represent
humans, and one, representing the ethical machine, the ‘A-robot’,
named after Asimov. Winfield programmed the A-robot with a rule
analogous to Asimov’s first law: if it perceived an H-robot in danger
of falling into a hole, it must move into the H-robot’s path to save it.
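
A rule of this kind can be sketched in a few lines. The code below is a toy illustration, not Winfield's implementation: it assumes robots move in straight lines on a flat plane, predicts each H-robot's path a short way ahead, and picks the nearest endangered one to intercept. The class, the function names and the nearest-first tie-break are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Robot:
    x: float
    y: float
    vx: float = 0.0  # velocity components, in distance units per second
    vy: float = 0.0


def heading_into_hole(h, hole_xy, hole_radius, steps=50, dt=0.1):
    """Predict the H-robot's straight-line path; True if it enters the hole."""
    x, y = h.x, h.y
    for _ in range(steps):
        x += h.vx * dt
        y += h.vy * dt
        if math.hypot(x - hole_xy[0], y - hole_xy[1]) <= hole_radius:
            return True
    return False


def choose_target(a, h_robots, hole_xy, hole_radius):
    """Return the endangered H-robot nearest to the A-robot, or None."""
    at_risk = [h for h in h_robots
               if heading_into_hole(h, hole_xy, hole_radius)]
    if not at_risk:
        return None
    return min(at_risk, key=lambda h: math.hypot(h.x - a.x, h.y - a.y))


a_robot = Robot(0.0, 0.0)
walker = Robot(1.0, 0.0, vx=1.0)    # heading straight for the hole
bystander = Robot(0.0, 5.0, vy=1.0)  # moving away from it
target = choose_target(a_robot, [walker, bystander], (3.0, 0.0), 0.5)
```

With a single endangered H-robot the choice is unambiguous; the rule says nothing about what to do when two are equally close, so extra rules would be needed for that case.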

Winfield put the robots through dozens of test runs, and found 
that the A-robot saved its charge each time. But then, to see what 
the allow-no-harm rule could accomplish in the face of a moral 
dilemma, he presented the A-robot with two H-robots wandering 
into danger simultaneously. Now how would it behave? 

The results suggested that even a minimally ethical robot could be 
useful, says Winfield: the A-robot frequently managed to save one 
‘human’, usually by moving first to the one that was slightly closer
to it. Sometimes, by moving fast, it even managed to save both. But 
the experiment also showed the limits of minimalism. In almost 
half of the trials, the A-robot went into a helpless dither and let both 
‘humans’ perish. To fix that would require extra rules about how to
make such choices. If one H-robot were an adult and another were a 
child, for example, which should the A-robot save first? On matters 
of judgement like these, not even humans always agree. And often, 
as Kaplan points out, “we don't know how to codify what the explicit 
rules should be, and they are necessarily incomplete”. 
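
The gap that Kaplan describes can be made concrete with a minimal sketch. It is not Winfield's code: the coordinates, the straight-line distance measure and the function name choose_rescue are assumptions made purely for illustration. The rule 'go to the closest endangered H-robot' is decisive only while one candidate is clearly nearest.

```python
import math

def choose_rescue(a_robot, h_robots_in_danger):
    """Return the index of the H-robot to intercept, or None if undecided.

    Hypothetical rule for illustration: head for the closest endangered
    H-robot. Positions are (x, y) tuples on a flat surface.
    """
    if not h_robots_in_danger:
        return None
    distances = [math.dist(a_robot, h) for h in h_robots_in_danger]
    nearest = min(distances)
    candidates = [i for i, d in enumerate(distances)
                  if math.isclose(d, nearest)]
    # One clear candidate: the rule decides. A tie leaves the rule with
    # no basis for choosing, so an extra rule would be needed here.
    return candidates[0] if len(candidates) == 1 else None

print(choose_rescue((0, 0), [(1, 0), (3, 0)]))   # one H-robot is closer
print(choose_rescue((0, 0), [(1, 0), (-1, 0)]))  # equidistant H-robots
```

In this toy version an exact tie is the only case left undecided; the experiment's dithering may well have had other causes, but the sketch shows why a single rule is, in Kaplan's words, necessarily incomplete.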

Advocates argue that the rule-based approach has one major vir- 
tue: it is always clear why the machine makes the choice that it does, 
because its designers set the rules. That is a crucial concern for the 
US military, for which autonomous systems are a key strategic goal. 
Whether machines assist soldiers or carry out potentially lethal mis- 
sions, “the last thing you want is to send an autonomous robot on a 
military mission and have it work out what ethical rules it should
follow in the middle of things”, says Ronald Arkin, who works on robot
ethics software at Georgia Institute of Technology in Atlanta. If a
robot had the choice of saving a soldier or going after an enemy com- 
batant, it would be important to know in advance what it would do. 

‘Robear’ is designed to help to care for ill or elderly people.

With support from the US defence department, Arkin is designing
a program to ensure that a military
robot would operate according to international
laws of engagement. A set of algorithms
called an ethical governor computes
whether an action such as shooting a missile
is permissible, and allows it to proceed only if the answer is ‘yes’.
In a virtual test of the ethical governor, a simulation of an 
unmanned autonomous vehicle was given a mission to strike enemy 
targets — but was not allowed to do so if there were buildings with 
civilians nearby. Given scenarios that varied the location of the vehi- 
cle relative to an attack zone and civilian complexes such as hospitals 
and residential buildings, the algorithms decided when it would be 
permissible for the autonomous vehicle to accomplish its mission.
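
The gatekeeping idea can be sketched in a few lines. This is a toy under stated assumptions, not Arkin's software: the function names, the flat (x, y) coordinates in metres and the 500-metre clearance are all hypothetical. The point is only the structure, in which a permissibility check sits between the decision and the action.

```python
import math

def strike_permitted(target, civilian_sites, min_clearance_m):
    """Answer 'yes' only if every civilian site lies beyond the clearance."""
    return all(math.dist(target, site) > min_clearance_m
               for site in civilian_sites)

def governed_strike(target, civilian_sites, min_clearance_m=500.0):
    # The governor gates the action: it proceeds only on a 'yes'.
    # The 500 m default is an arbitrary, illustrative threshold.
    if strike_permitted(target, civilian_sites, min_clearance_m):
        return "proceed"
    return "withhold"

print(governed_strike((0, 0), [(1000, 0)]))  # hospital 1 km from target
print(governed_strike((0, 0), [(100, 0)]))   # hospital 100 m from target
```

Because the rule is written down explicitly, anyone can read off in advance what the system will do in a given scenario, which is the virtue that advocates of the rule-based approach stress.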


“Logic is how we reason and come up 
with our ethical choices.” 


Autonomous, militarized robots strike many people as danger- 
ous — and there have been innumerable debates about whether 
they should be allowed. But Arkin argues that such machines could 
be better than human soldiers in some situations, if they are pro- 
grammed never to break rules of combat that humans might flout. 

Computer scientists working on rigorously programmed 
machine ethics today favour code that uses logical statements, such 
as ‘If a statement is true, move forward; if it is false, do not move’.
Logic is the ideal choice for encoding machine ethics, argues Luis
Moniz Pereira, a computer scientist at the Nova Laboratory for
Computer Science and Informatics in Lisbon. “Logic is how we
reason and come up with our ethical choices,” he says.

Crafting instructions capable of the logical steps that go into mak- 
ing ethical decisions is a challenge. For example, Pereira notes, the 
logical languages used by computer programs have trouble coming 
to conclusions about hypothetical scenarios, but such counterfactuals 
are crucial in resolving certain ethical dilemmas. 

One of these is illustrated by the trolley problem, in which you 
imagine a runaway railway trolley is about to kill five innocent people
who are on the tracks. You can save them only if you pull a lever
that diverts the train onto another track, where it will hit and kill an
innocent bystander. What do you do? In another set-up, the only way 
to stop the trolley is to push the bystander onto the tracks. 

People often answer that it is all right to stop the trolley by hitting 
the lever, but viscerally reject the idea of pushing the bystander. The 
basic intuition, known to philosophers as the doctrine of double 
effect, is that deliberately inflicting harm is wrong, even if it leads 
to good. However, inflicting harm might be acceptable if it is not 
deliberate, but simply a consequence of doing good — as when the 
bystander simply happens to be on the tracks. 

This is a very difficult line of analysis for a decision-making program.
To begin with, the program must be able to see two different
futures: one in which a trolley kills five people, and another in
which it hits one. The program must then ask whether the action 
required to save the five is impermissible because it causes harm, or 
permissible because the harm is only a side effect of causing good. 

To find out, the program must be able to tell what would happen 
if it chose not to push the bystander or pull the lever — to account 
for counterfactuals. “It would be as if a program was constantly
debugging itself,” says Pereira — “finding where in a line of code
something could be changed, and predicting what the outcome 
of the change would be.” Pereira and Ari Saptawijaya, a computer 
scientist at the University of Indonesia in Depok, have written a 
logic program that can successfully make a decision based on the
doctrine of double effect, as well as the more sophisticated doctrine 
of triple effect, which takes into account whether the harm caused 
is the intended result of the action, or simply necessary to it. 
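
The counterfactual test behind the doctrine of double effect can be illustrated with a short sketch. It is emphatically not Pereira and Saptawijaya's logic program: the two 'world' functions and the name permissible_by_double_effect are assumptions invented here, and each world is reduced to a single question, namely whether the five are saved.

```python
def lever_world(act, bystander_harmed):
    # Diverting the trolley saves the five whether or not anyone
    # happens to be standing on the side track.
    return act

def push_world(act, bystander_harmed):
    # The trolley stops only because the bystander's body blocks it.
    return act and bystander_harmed

def permissible_by_double_effect(world):
    """Hypothetical check: harm may be a side effect, never the means."""
    good_with_harm = world(True, True)
    # The counterfactual future: same action, but the harm removed.
    good_without_harm = world(True, False)
    harm_is_means = good_with_harm and not good_without_harm
    return good_with_harm and not harm_is_means

print(permissible_by_double_effect(lever_world))
print(permissible_by_double_effect(push_world))
```

In the lever case the five are saved even in the imagined future where the bystander is unharmed, so the harm counts as a side effect; in the push case the good vanishes without the harm, so the harm is the means. Real ethical reasoning would need far richer models of the world than these two boolean functions.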


HUMANS, MORALS, MACHINES 

How ethical robots are built could have major consequences for 
the future of robotics, researchers say. Michael Fisher, a computer 
scientist at the University of Liverpool, UK, thinks that rule-bound 
systems could be reassuring to the public. “People are going to be 
scared of robots if they're not sure what it’s doing,” he says. “But if 
we can analyse and prove the reasons for their actions, we are more 
likely to surmount that trust issue.” He is working with Winfield
and others on a government-funded project to verify that the out- 
comes of ethical machine programs are always knowable. 

By contrast, the machine-learning approach promises robots 
that can learn from experience, which could ultimately make them 
more flexible and useful than their more rigidly programmed 
counterparts. Many roboticists say that the best way forward will 
be a combination of approaches. “It’s a bit like psychotherapy,” says 
Pereira. “You probably don’t just use one theory.” The challenge —
still unresolved — is to combine the approaches in a workable way. 

These issues may very soon come up in the fast-moving field of 
autonomous transport. Already, Google's driverless cars are zipping 
across parts of California (see Nature 518, 20–23; 2015). In May,
autonomous trucks from German car-maker Daimler began driv- 
ing themselves across the Nevada desert. Engineers are thinking 
hard about how to program cars to both obey rules and adapt to 
situations on the road. “Up until now we've been trying to do things 
with robots that humans are bad at,” such as maintaining attention
on long drives or being quick on the brakes when the unexpected 
occurs, says Bernhard Weidemann, a spokesperson for Daimler in 
Stuttgart. “Going forward, we will have to try to program things that 
come more naturally to humans, but not to machines.”


Boer Deng is a news intern for Nature in Washington DC.

Don’t cry wolf


Tighten the requirements for declaring physics breakthroughs, says Jan Conrad. 


The past few years have seen a slew of
announcements of major discoveries
in particle astrophysics and cosmology.
The list includes faster-than-light neutrinos;
dark-matter particles producing γ-rays;
X-rays scattering off nuclei underground; and 
even evidence in the cosmic microwave back- 
ground for gravitational waves caused by the 
rapid inflation of the early Universe. Most of 
these turned out to be false alarms; and in my 
view, that is the probable fate of the rest. 
There are consequences to broadcasting 
seemingly extraordinary results to peers 
and the public before they are reviewed, 
or despite knowing that better data are just 
around the corner. Colleagues who once got 
excited now shake their heads and joke about 
‘yet another dark-matter candidate’. The field
has cried wolf too many times and lost credibility.
One colleague told me that granting
panels are becoming wary of funding astro- 
physical searches for dark-matter particles. 

I also worry that false discoveries are 
undermining public trust in science. As cos- 
mic phenomena come and go — not to men- 
tion endless speculation about hypothetical 
concepts such as parallel and holographic 
universes — why should anyone believe that 
any scientific result will hold?

Several trends have brought us to this state 
of affairs. Intense competition, increased use 
of public data sets and online publishing of 
draft papers without proper refereeing have 
eroded traditional standards for making 
extraordinary claims. 

Particle physics and astrophysics pioneered
the open release of data and publications
more than two decades ago; other disciplines 
are following their lead. The scientific com- 
munity must now address the habits that have 
crept in to ensure that enticing reports of false 
discoveries do not overwhelm more sober 
accounts of genuine scientific breakthroughs. 


SHIFTING PRACTICES 

Three changes in the ways that scientific 
studies are done and reported are fuelling 
this rash of false discoveries. 

First, statistical standards have fallen. 
Extraordinary claims demand extraordinary 
proof. In particle physics, the usual threshold 
is ‘5 sigma’: a signal 5 times stronger than the 
average noise level (sigma), which translates 
to a roughly 1-in-3.5-million probability
that the results were due to chance. But
5-sigma claims are becoming rare as scientists 
rush to assert priority with exciting but tenta- 
tive results. The July 2012 official announce- 
ment of the discovery of the Higgs boson 
with the Large Hadron Collider at CERN, 
Europe's particle-physics lab near Geneva, 
Switzerland, was preceded by press releases 
of weak but suggestive indications even 
though there was no competition. 
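
The 1-in-3.5-million figure can be checked with a short calculation. The sketch below assumes Gaussian noise and a one-sided test, which is the usual convention behind the 5-sigma rule; the function name one_sided_p_value is introduced here only for illustration.

```python
import math

def one_sided_p_value(n_sigma):
    """Upper-tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

for n in (3, 4, 5):
    p = one_sided_p_value(n)
    print(f"{n} sigma: p = {p:.3g}, about 1 in {1 / p:,.0f}")
```

Under these assumptions a 5-sigma excess corresponds to roughly 1 in 3.5 million, whereas a 3-sigma excess is about 1 in 740 and a 4-sigma excess about 1 in 32,000. That is why results quoted at 3 or 4 sigma are so much more likely to be noise fluctuations.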

That scientists change the wording in their 
papers from ‘discovery’ to ‘evidence’ or ‘indication’
has little influence on how the results
are used. Take the latest dark-matter discov- 
ery claim. On 8 March, astronomers posted 
a preprint of a paper in the arXiv repository, 
and their university issued a press release 
reporting what the authors called a “tantalizing”
sign of γ-rays coming from a recently
found dwarf-galaxy companion to the Milky
Way that is allegedly rich in dark matter.
The γ-ray signal, found in images from the
Fermi γ-ray satellite’s Large Area Telescope
(LAT), seemed to be consistent with high- 
energy radiation produced when particles 
of dark matter annihilate. But the photon 
excess of only 3-4 times the noise level was 
inconclusive, as the authors acknowledged. 

Another paper posted on arXiv the same 
day disfavoured the discovery. A more com- 
prehensive re-analysis of the same data by 
the Fermi-LAT instrument team — using
updated software 30–40% more sensitive
— recorded no signal beyond noise. The 
authors of the first paper acknowledged 
that the software upgrade was imminent and 
would confirm or refute their claim, but did 
not wait for it. 

Detecting a noise fluctuation is nothing 
new, but the possibility that the ‘detection’ 
might have been dark matter meant that 
it was widely reported in the media. Even 
balanced reporting raises the issue in the 
public’s mind; the account in The New York 
Times mentioned the non-detection, but the
hint of excitement drove the story. 

Second, the greater use of public data sets 
increases the risk that some researchers will 
make spurious detections near the edge of an 
instrument's sensitivity. More brains may be 
picked to mine the data. But analysis is dif- 
ficult without inside information from those 
who built and calibrated the instrument. 

That was the case with the Fermi-LAT 
dark-matter detection. The released Fermi- 
LAT data — public since 2009 — are the 
product of complicated algorithms and 
calibrations that turn the electronic signals 
of detectors into quantities that any physi- 
cist can in principle analyse. The instrument 
builders, however, have the know-how to 
push the noise limits down. 

The risk that someone will misuse data 
also grows when more people have access to 
them. Even the largest collaborations cannot 
police discoveries made by outsiders using
their data. Even if they internally re-run the
analysis, the damage is done once an errone- 
ous result has been made public. 

Third, many more papers are now released 
on preprint servers such as arXiv (which had 
about 100,000 submissions in 2014), and 
press releases are sent out before peer review. 
Competition for positions, funding, career 
metrics such as the h-index and prizes drives
the rush to publish prematurely and
publicize results. Incorrect papers
posted on arXiv do more than add to
the noise of irrelevant results. Funding
decisions are skewed; theorists waste a lot
of time trying to devise explanations; and the
public is misled through news reports.

“Metrics need to be devised that distinguish citations of discredited claims.”

A striking example of a premature claim 
released online before peer review was the 
report last year of evidence for gravitational 
waves and cosmic inflation — the Universe's 
rapid expansion in the instant after the Big 
Bang — by the BICEP2 microwave telescope 
at the South Pole. The detection of a swirled 
polarization pattern, known as a B-mode, in
the cosmic microwave background (radiation
left over from the Big Bang) was not in
doubt — it had a 7-sigma signal. But its supposed
cosmic origin turned out to be false.
It was shown six months later — with data
from the European Space Agency’s Planck
satellite — to be warm dust in the Milky
Way. Again, I believe that the authors of
the original paper must have known of the
impending Planck data but chose to blow
their trumpet ahead of confirmation. Eventually,
the BICEP2 and Planck collaborations
worked together to arrive at a solid result, an
approach that should have been considered 
from the beginning. 


QUALITY CONTROL 

To avoid further weakening of scientific 
standards and reputations, researchers need 
to stick to scientific best practice. 

A first step would be for physicists to make 
sure that they apply the 5-sigma rule (or an 
equivalent) for firm discoveries. Online 
posting should not be elided with publica- 
tion. It is premature to announce an impor- 
tant finding to the public at the same time as 
it is announced to scientific peers. Critical 
examination by peers is necessary — not 
least to avoid personal biases. 

As long as online posting is confused with 
the release of deeply scrutinized results, 
quality assurance of preprints posted online 
should become stricter. An ‘endorsement’
system, whereby users must be endorsed
by other users before posting a paper, has 
been developed by arXiv to ensure that non- 
scientific pieces are not hosted there. More 
is needed for extraordinary claims. Named
reviewers for major discoveries would
reassure the readers and authors, as well 
as crediting the reviewers. Journals should 
discourage the referencing of arXiv papers. 

Instrument builders and specialists who 
collected the original data should review 
major claims that are based on public data 
sets, either as referees, advisers or collabo- 
rators. Other teams with ancillary data that 
could refute or prove a claim should be 
involved in checking major results before 
release. This will require voluntary good 
conduct by competitors, which again could 
be encouraged by naming reviewers on 
breakthrough papers. 

A system needs to be established to reward 
best practice. Collaborations should estab- 
lish a way to ensure that a data team work- 
ing with an individual scientist will not 
competitively sink the scientist’s publication 
nor diminish their visibility. Internal review 
should precede announcements of major 
results at conferences. Policies should be 
devised for author lists to give proper credit. 

Journals and arXiv should find a strategy 
for allocating credit to the lead scientists 
in such collaborations. The BICEP2 team, 
for example, did work with the Planck col- 
laboration later; had they been able to mark 
their priority better they might have delayed 
a press conference. 

Not surprisingly, the original BICEP2 
paper has ten times more citations than the 
final word; many incorrect papers are more 
highly cited than counter cases. Academic 
metrics need to be devised that distinguish 
citations of discredited claims so that it is 
not more advantageous to state and retract 
a result than to make a solid discovery. 

Physicists’ associations (such as the 
American Physical Society or the Interna- 
tional Union of Pure and Applied Physics) 
should lead a movement akin to the biology 
community’s reproducibility initiatives. 
Scientists, publishers and representatives of 
funding agencies must convene to discuss 
improvements to norms such as peer review, 
metrics, use of databases, quality assurance 
and codes of conduct.


Jan Conrad is professor of astroparticle 
physics at the Oskar Klein Centre for 
Cosmoparticle Physics, Stockholm 
University, Stockholm, Sweden.

Cities such as Lahore in Pakistan can have traffic jams that last for hours.


Six research routes to steer transport policy


Strategies must better balance the costs and benefits of travel and be realistic 
about the promises of new technologies, say Eric Bruun and Moshe Givoni. 


Society is not balancing the benefits and
costs of travel. Although technology
enables us to get around faster than
ever, many cities are gridlocked — São Paulo
in Brazil frequently experiences eight-hour
traffic jams. More than 90% of the 1.2 million
traffic deaths each year worldwide occur in
developing countries and half involve pedestrians,
cyclists and motorcyclists. Premature
death from vehicle-related fine-particle air
pollution worldwide is predicted to rise by
50% by 2030. In rich countries, sedentary lifestyles
and obesity are in large part the result of
our love affair with the car. In poor countries,
people may spend two hours walking to work
to avoid a modest bus fare.

Transport research is central to twenty-first-century
global challenges that include
energy provision, climate change and health.
Yet the field is stuck. The language is changing
— ‘transport’ has become ‘mobility’ —
and sustainability is more often mentioned
in research papers and policy documents.
But most planners are still hopelessly trying 
to fight congestion, and most researchers 
and policy-makers put too much faith in 
technological solutions. 

Reframing mobility research to answer 
the following six questions will inform better 
transport policies. 


SIX QUESTIONS 
What are the long-term impacts of new 
technologies? Although the excitement 
associated with a new product, service 
or tool is often justified, the negative, 
unintended impacts must be anticipated. 
Take the driverless car. Depending on 
whom one asks, such cars will be in wide 
use in some countries by 2025 or 2050. They 
are framed as a technology that offers cheap 
mobility while saving time and energy. But
it was exactly this thinking that brought
us the ‘with-driver’ private car and its
unsustainable consequences.

The driverless car promises to be even 
more successful. Getting people out of their 
driverless cars will be even harder. 

On average, people around the world 
spend an hour a day travelling, a pattern that 
has held for centuries and across cultures. 
When we are able to eat, sleep and work in 
our driverless cars, this time will become 
longer, creating a burst of urban sprawl with 
its associated increases in energy consump- 
tion and adverse impacts on the land. 

The stakes are too high to believe the 
promises of new mobility technologies 
without extensive research that goes beyond 
the technical, regulatory and commercial. 
Researchers and policy-makers need to 
treat any significant technological change 
as a ‘socio-technical’ change that alters daily 
practices and functioning. Protection of 
personal privacy will be a particular challenge.
On the basis of impact assessments,
governments may need to discourage certain
new technologies or encourage their utili- 
zation in a particular way. For example, 
driverless vehicles hold great potential for 
public transportation. 


How should the impacts of transport 
systems be evaluated? Economic cost- 
benefit analysis is an increasingly contro- 
versial method for assessing investment and 
policy decisions. Transport affects so many 
aspects of life, particularly in urban areas, 
over such a long time that a monetary focus 
alone cannot do the issue justice. Analysis of 
multiple criteria offers some improvement 
on cost-benefit, but it is unrealistic to expect 
to capture all impacts in one score. 

Who benefits and who doesn’t needs to be 
accounted for. For example, building motor- 
ways through US cities in the 1960s divided 
low-income black and minority ethnic 
communities while enabling ‘white flight’ 
to the suburbs. Moreover, under the current 
accounting system, future generations lose 
out as the discounting of costs and benefits 
in the future encourages the consumption of 
non-renewable resources now. 
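
The effect of discounting on future generations is easiest to see in numbers. The sketch below uses the standard present-value formula; the 3.5% annual rate and the cost of 100 units are hypothetical values chosen only for illustration, not figures from any particular appraisal.

```python
def present_value(amount, annual_rate, years):
    """Value today of a cost or benefit that falls `years` in the future."""
    return amount / (1.0 + annual_rate) ** years

# Hypothetical example: a cost of 100 units, discounted at 3.5% a year,
# borne now, in 10 years, in 50 years and in 100 years.
for years in (0, 10, 50, 100):
    print(years, round(present_value(100.0, 0.035, years), 1))
```

At this rate a cost of 100 units that falls a century from now counts for only about 3 units in today's accounts, so depleting a resource now looks cheap on paper even when the eventual cost is large.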

The value placed on travel time needs to 
be reconsidered. It can be a waste and viewed 
as a monetary cost, but with wireless tech- 
nology or a good book it can also be pro- 
ductive and fun — even more so when the 
driverless car arrives. 

Travel behaviour models — used to 
project future demand — are crucial for any 
evaluation. Increasingly sophisticated mod- 
els that are largely based on random utility 
theory from mathematical psychology have 
been developed over the past 30 years to bet- 
ter capture reality. But they are so complex 
and expensive that most cities cannot afford 
them or collect the data required. Results 
that can be comprehended only by the mod- 
ellers are not transparent enough to support 
democratic decision-making. 

New York City is introducing more bike lanes.

Researchers must come up with new
evaluation methods that are robust and
scientifically defensible. The outputs must 
be comprehensible to elected officials and 
to the public. Such methods must include 
both quantitative and qualitative benefits 
and costs, and capture a much larger array of 
them. For example, researchers might con- 
sider whether the public perceives that the 
comfort and beauty of their city will improve 
or deteriorate after a major investment. 

A good example is the Øresund EcoMobility
project, a joint effort between Swedish
and Danish […]. It is combining
cost-benefit analysis with risk assessment and
qualitative impacts of interest to the population,
determined at a ‘decision conference’ at
which planners and public representatives 
can see and hear differences of opinion. 


How does the structure of cities affect 
sustainability, living standards and 
functioning costs? Studies of urban den- 
sity, energy consumption and travel statistics 
show that altering the form of cities to reduce 
greenhouse-gas emissions is as effective as 
improving technologies and substituting 
fuels, yet it receives much less attention.
Energy consumption per capita by private
vehicles declines with higher urban density,
for instance (see ‘Transport trends’).

Total expenditures (public and private) on 
passenger transport decrease as urban den- 
sity increases. Yet zoning and infrastructure 
investment decisions are not based on 
broader scientific analyses of the impacts. 
Understanding the drivers of sprawl is of 
vital importance for fast-growing cities such 
as Mexico City, Delhi or Lahore in Pakistan 
that are swallowing up adjacent farmland 
and wetlands. 

Researchers need to consider more fully 
urban transport alongside other features of 
the built environment. We know, for example, 
that taller buildings and smaller areas used 
per person for a given type of activity (resi- 
dential, commercial or recreational) tend to 
be associated with compact and energy-effi- 
cient cities, such as Hong Kong. Yet planners 
lack accurate models for land use develop- 
ment that consider many design variables. 

Scientists and planners urgently need to 
understand the significance of changes in 
urban development plans on the construc- 
tion and operating costs of all aspects of the 
built environment, on total urban energy 
consumption, on living standards and on 
space consumption. To accomplish this,
universities and urban governments need
to break down traditional borders between
disciplines and professional responsibilities.

© 2015 Macmillan Publishers Limited. All rights reserved


How can mobility beyond cities be 
improved? Transport links beyond cities 
are important for regional development. The 
exodus of people from the countryside to 
cities in search of employment is intensify- 
ing urbanization worldwide and increasing 
pressure on scarce resources. Urbanization 
scholars highlight the rapid growth of meg- 
acities, but smaller cities and villages must 
also be considered. 

Researchers need to model how better 
connections — local, regional and between 
a city and its hinterlands — might improve 
the prospects of smaller towns and rural 
areas. Studies need to be more inclusive, 
politically neutral and regionally equitable. 
Transport investments often favour large 
cities and their links with expensive, fast 
transport options. The opportunity costs 
— what a government could have done with 
the money — are not considered sufficiently. 

For example, the UK government plans to 
spend more than £40 billion (US$63 billion) 
on High Speed 2, a 400-kilometre-per-hour 
rail link between London and Birmingham 
and (later) Manchester. In our view, the 
money would be better spent on improving 
the country’s entire public-transport net- 
work, which is poor by European standards, 
and on local and regional transport, which 
is currently dominated by private-car use.
Joining up many small cities could ben- 
efit the national economy and society: for 
every 12 jobs created in cities in the south of 
England between 2004 and 2013, only one 
was created elsewhere in Britain. Many other 
countries and regions face similar decisions. 


How could transport be improved in 
developing countries? Researchers need to 
assess and suggest ways to establish rapid, 
cheap and effective transport systems in 
poor nations. Elaborate physical redesigns 
of infrastructure, similar to those made by 
high-income countries, take too much time 
and money to implement. Instead, develop- 
ing countries should learn from developed 
countries’ planning mistakes. They could 
also ‘leapfrog’ to the latest technologies. 

For example, in Nairobi in 2013, student 
bus passengers were issued with smartphones 
that allowed city planners and research- 
ers to track their routes, count riders and 
identify areas of congestion (see go.nature. 
com/ihhyét). An app that contacts a traffic 
signal to let a bus through an intersection 
can aid mobility flow. Private-sector invest- 
ment in such systems is low because of the 
lack of commercial prospects for what seems 
like simple technology. Public-sector funding
of such applied research and collaborations
between universities from high- and
low-income countries should be encouraged.

TRANSPORT TRENDS
Sprawling metropolises such as Atlanta, Georgia, have higher energy consumption per capita because their residents are highly dependent
on cars for long journeys, whereas cities such as London and Hong Kong have higher building densities and more public transport.

But technology can get in the way. For 
example, farmers in South Asia experience 
impassable roads during the rainy or mon- 
soon season. Money would be better spent 
on gravelling or paving farm tracks than on 
widening major roads linking large cities. 
Similarly, investing in standard bus services 
that have been allowed to wither, rather than 
in more sophisticated high-capacity rail and 
bus mass-transit systems that mimic those 
in developed countries, will be more beneficial.
It will be cheaper, be realized sooner and carry
lower risk, and, importantly, the benefits
will be spread more widely across the city and its
inhabitants.


What kinds of governance work for the 
transport system? Services such as Uber 
— the taxi smartphone app that connects 
passengers with drivers in dozens of cities 
worldwide — and variants of car sharing 
have caught the transport planning sector 
by surprise; institutions are not sure whether 
to support or fight these advances. Like any 
innovation they are a great opportunity but 
also carry risks. Even with shared cars, it is 
physically impossible for large cities to meet 
everyone’s travel needs with what is essentially
a variation of single-occupant vehicles. When 
finding that parking a car is not an issue any 
more, people may flock again to cars, revers- 
ing the accessibility, sustainability and livabil- 
ity trends experienced in many cities such as 
New York, which is promoting public trans- 
port, cycling and walking as modes of transit. 
So, good public transit will be needed more 
than ever to compete with the car.

Research is needed to understand the
policy implications of rapid changes in 
transport technologies and systems and 
how institutions should evolve to accommodate
them. Methodologies and tools are
needed for devising effective policies, com- 
bining them strategically and overcoming 
implementation barriers such as public and 
political acceptability. 

One methodology that could be expanded 
is ‘policy packaging’, in which a combination
of instruments is implemented while steps 
are taken to minimize unintended effects 
and increase the chances of an intervention’s 
success. For example, London's congestion- 
charge scheme — which charges car users 
a fee to enter the city centre during certain 
hours — was accompanied by improvements 
to public transport and heavy discounts for 
residents within the zone. Such policy packages
have been suggested for promoting car-sharing
services in European cities (see, for
example, www.spreeproject.com). 


FRESH THINKING 

Governments should support system-level 
research that is needed by the public sector 
yet attracts scant funding from the private 
sector. The majority of research money for 
transport currently goes to technological 
development with commercial potential — 
such as the driverless car — which already 
receives private funding. 

Universities and governments need to 
realign research incentives to support the 
interdisciplinary scholarship needed. This 
includes stable funding and centres that can 
attract and nurture a variety of talent and 
cross-border collaboration, especially where 
there is lack of commercial potential but 
great promise to society. If not, researchers 
will remain in narrow specialities in which 
funding and publishing are safer. 

Our transport systems, as well as our 
cities, must be planned for people — not for 
a particular mode of transport or by a hand- 
ful of companies with vast lobbying power. 
Delivering low-carbon mobility for all will 
take fresh thinking. 
Wind turbines are low-density sources of power. 


Profiles of power 


Arnulf Grubler examines a study of power output and 
spatial area — a key concept in discussing renewables. 


Vaclav Smil is a prolific, sometimes 
controversial, but invariably thought-provoking 
author. His latest book 
centres on a simple but important concept. 
Power density — defined in the book as “energy’s 
rate of flow [transfer] per unit of surface 
area” of land or water — matters because these 
densities differ vastly for different methods 
of energy generation and use. That difference 
needs to be reconciled: massive, costly energy 
infrastructures such as long-range transport 
and storage are involved. 

Megacities, with their concentrations 
of high-rise buildings, demand enormous 
quantities of energy in a comparatively small 
amount of space at any one time — and so 
have a high density of power demand. But 
diffuse renewable energy sources such as 
sunlight or biomass have low energy yields 
per hectare and intermittent availability, and 
so are low-density power supplies. Hence, in 
any transition towards renewables, cities will 
require vast renewable-energy hinterlands 
along with extensive cropland for food. 

Smil excels when discussing the historical 
context of how our economy, cities and indus- 
try rose by harnessing high-power-density 
fossil fuels. He is also good at explaining the 
fundamentals that underlie the low power 
densities of renewables, such as the low effi- 
ciency by which plants convert the radiant 
energy of sunlight into chemical energy in the 
form of biomass (0.5%, Smil shows). Photos 
illustrate the energy options well. 

However, Smil’s technical discussion of 
options from renewables such as solar and 
wind to fossil fuels such as coal and gas, with 
its many site-specific examples, is necessarily 
repetitive and thus a tedious read. Moreover, 
he fails to clarify some fundamental concepts 
or to differentiate appropriately between 
different qualitative aspects of power and 
area. 

Power Density: A Key to Understanding 
Energy Sources and Uses 
VACLAV SMIL. 

Most fundamentally, he does not explain 
the concepts of power and energy per se; he 
seems to presume that readers are familiar 
with them. And it is unclear why he focuses 
on power density rather than energy density, 
or does not combine the two (given that 
energy = power × time). Imagine a stream of 
water running at 1 litre per second. The total amount of 
water (86,400 litres a day) is the energy; the 
flow at any one moment is the power (1 litre 
per second). If you need more than the available flow, your 
power demand exceeds the power supply and 
cannot be met. If you dam and store the flow, 
then suddenly release it, you vastly increase 
the power; but you cannot increase the total 
energy. 
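
The stream analogy can be put into numbers. What follows is a minimal Python sketch rather than anything taken from Smil's book; the function names are hypothetical and the one-hour release time is an assumed value chosen only for illustration.

```python
# Stream analogy: flow rate stands in for power, accumulated volume for energy.
SECONDS_PER_DAY = 24 * 60 * 60


def total_volume(flow_l_per_s: float, seconds: float) -> float:
    """Volume delivered (the analogue of energy) = flow (analogue of power) x time."""
    return flow_l_per_s * seconds


def release_flow(stored_l: float, release_seconds: float) -> float:
    """Flow obtained by releasing a stored volume over a chosen time."""
    return stored_l / release_seconds


# A steady 1 litre per second adds up to 86,400 litres over a day.
daily_volume = total_volume(1.0, SECONDS_PER_DAY)

# Assumption: the day's water is dammed and then released within one hour.
burst_flow = release_flow(daily_volume, 3600)

print(daily_volume, burst_flow)
```

Under the assumed one-hour release, the flow rises to 24 litres per second, yet the volume delivered is still the 86,400 litres that accumulated: storage raises the power, not the total energy.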

In my view, the primary constraint is the 
total usable energy flow. The rate at which 
energy can be transferred (power) acts as a 
secondary constraint, particularly for renew- 
ables, because of their low energy density and 
intermittent availability. Conversely, a high 
power density does not necessarily translate 
into a high energy density. To illustrate: light- 
ning has an electric power equivalent to the 
output of ten large nuclear power stations. 
But because it lasts for just a fraction of a sec- 
ond, it only provides enough energy to power 
an electric car for less than 1 kilometre. 
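
The lightning comparison is the same arithmetic, energy = power × time, run in the other direction. The sketch below is only a rough illustration: the station output, the strike duration and the car's consumption are all assumed round numbers, not figures from the review or the book.

```python
# Energy = power x time, applied to a brief but very powerful event.
# Every constant below is an illustrative assumption, not a measurement.
STATION_POWER_W = 1e9                 # assumed output of one large nuclear station (1 GW)
STRIKE_POWER_W = 10 * STATION_POWER_W # "ten large nuclear power stations"
STRIKE_DURATION_S = 50e-6             # assumed duration: 50 microseconds
CAR_USE_J_PER_KM = 0.2 * 3.6e6        # assumed electric-car consumption: 0.2 kWh per km


def energy_joules(power_w: float, seconds: float) -> float:
    """Energy delivered by a constant power over a given time."""
    return power_w * seconds


strike_energy_j = energy_joules(STRIKE_POWER_W, STRIKE_DURATION_S)
range_km = strike_energy_j / CAR_USE_J_PER_KM

print(f"{strike_energy_j:.0f} J, about {range_km:.2f} km")
```

With these assumptions the strike delivers roughly half a megajoule, enough for well under a kilometre of driving, which is consistent with the point that very high power over a very short time yields little energy.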

The book also creates confusion because 
many of Smil’s examples of power density 
are actually energy densities. He is aware that 
power density as a measure should not focus 
on brief bursts of extreme energy availabil- 
ity, such as the solar electricity generated at 
noon in southern Arizona; it should be aver- 
aged over a larger territory and longer peri- 
ods. And he appropriately calculates average, 
representative power densities, expressed as 
watts per square metre (W m⁻²). The standard 
unit for power, the watt, is defined as joules 
per second. But Smil’s power densities are 
averages calculated over an entire year, and 
so become energy densities (in watt-years per 
year), an equivalence he fails to explain. How- 
ever, he is not alone in this confusing use of 
power units that actually denote energy flows. 
Climate scientists also use W m⁻² to describe 
the radiative balance of the planet and its 
alteration by increas- 
ing concentrations of 
greenhouse gases. 
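
The equivalence between a year-averaged power density and an annual energy yield per unit area can be made explicit. The following is a hedged Python sketch; the function name and the example site are hypothetical and serve only to show the unit conversion.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def average_power_density(annual_energy_kwh: float, area_m2: float) -> float:
    """Return the year-averaged power density in watts per square metre."""
    annual_energy_wh = annual_energy_kwh * 1000.0
    # Dividing watt-hours by the hours in a year gives average watts.
    return annual_energy_wh / HOURS_PER_YEAR / area_m2


# Hypothetical site: 150,000 kWh per year from 10,000 square metres of land.
density = average_power_density(150_000, 10_000)
print(round(density, 2))
```

One square metre that yields 8.76 kWh in a year has an average power density of 1 W m⁻². The figure is a power only in the averaged sense and carries the same information as the annual energy yield.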

There are also 
important qualitative 
aspects that need to be 
considered. It matters whether power 
refers to electricity or biomass. And it 
matters whether land is used exclusively 
or only partly for energy. Smil raises this 
issue but does not address it systemi- 
cally. He could have used the concept of 
exergy (energy’s ability to perform use- 
ful work) to differentiate between high- 
quality energy (such as electricity, which 
is versatile) and low-quality energy (such 
as straw, which demands costly conver- 
sions to become usable beyond burn- 
ing). However, his comparisons look 
only at power densities, irrespective of 
quality. Smil sketches out a valuable tax- 
onomy of energy-related land uses along 
two dimensions. The first is exclusivity: 
the site of a power plant, for instance, is 
unusable for agriculture; right-of-way 
land underneath transmission lines is 
not. The second is longevity of use: a 
nuclear-waste repository will be in 
place for centuries, whereas an annual 
crop such as maize (corn) for conversion 
to ethanol can be grown in rotation. Yet 
Smil does not use this taxonomy, and as a 
result sometimes compares apples and oranges. 

Unlike in some of Smil’s other books, the 
production quality of Power Density is regrettably 
low. I prefer readable graphics 
accompanied consistently by source 
and data referencing. In this book, many 
graph labels are hardly legible, and fig- 
ures with references are the exception. 
Sometimes sources are mentioned in 
the text but not in the caption; at others, 
they are not even in the text. Graphs plot 
data on population and energy use, but 
statistical data sources are not specified 
or referenced. 

Power Density’s detailed examina- 
tion of the spatial constraints of energy 
options adds to Smil’s earlier, pioneering 
treatment of the subject, making it useful 
for energy specialists interested in explor- 
ing a massive ramp-up of renewables. But 
its technical nature and language make it 
rather inaccessible to a wider audience. 
And its failure to explain fundamental 
concepts such as the difference between 
power and energy, and to provide ade- 
quate data and source referencing, make 
it unsuitable as a textbook. 
humankind revisited 


Michael Cherry catches up with new developments and 
old dilemmas at South Africa’s hominin-fossil hotspot. 


after half an hour’s drive 
from Johannesburg, 
through the Gauteng 
Highveld of South Africa. 
This open, grassy space 
scattered with trees 
is a World Heritage 
Site, riddled with 
limestone caves 
and hominin fossils. 
A substantial chunk of the evidence that 
Africa is the wellspring of humanity was dis- 
covered here; and with anthropologist Lee 
Berger of the University of the Witwatersrand 
(Wits) in Johannesburg set to unearth more 
at the Malapa and Rising Star Cave sites, the 
Cradle still rocks. 

As a heritage site in a developing country, 
the area is a focus for national pride. But 
developments there are spurring questions 
over which part of the nation they serve. 

Palaeontologists will rejoice over the 
launch, on 21 July, of a state-of-the-art vault 
to house star local finds, an adjunct to Wits’s 
Centre of Excellence for Palaeosciences. The 
vault will allow specimens to be compared 
with other finds, both hominin and non- 
hominin, from around Africa. These include 
the Taung skull (Australopithecus africanus), 
dated to between 2 million and 3 million 
years ago, which was discovered north of 
Kimberley in 1924; specimens of Austra- 
lopithecus sediba discovered at Malapa, 
including a remarkably complete skeleton 
called MH1, as well as casts of East African 
discoveries such as Lucy (Australopithecus 
afarensis) and the type specimen of Homo 
habilis, found by anthropologists Mary and 
Louis Leakey. The vault’s laboratory has a 
micro-CT scanner and 3D printing facilities. 
But it is strictly for researchers’ use. 

What is there for the public? The 
Maropeng visitor centre opened a dec- 
ade ago as an interpretation centre for the 
Sterkfontein Caves, site of the discovery 
of the ‘Mrs Ples’ fossil (Australopithecus 
africanus) in 1947 and, 50 years later, Little 
Foot, the most complete early-hominin skel- 
eton known, which is as-yet undescribed. 
And late last year, a light, moveable, steel 
structure known as the Beetle was placed 
over the Malapa site to let the paying public 
view excavations, once they resume at the site. 
(Digging has been on hold since 2009, when the 
remains of four A. sediba individuals were removed.) 

Australopithecus sediba was discovered at Malapa Cave 
in South Africa. 

The Beetle protects the delicate lime- 
stone system from rain, and lets wild animals 
move freely below. Standing on eight clavi- 
cle-like supports, it has a fabric roof that col- 
lects rainwater and channels it to a sanitation 
system. Visitors will watch excavations from 
a raised circular walkway. A pulley below 
the platform is attached to a hoist capable of 
bearing a tonne of rock. 

But there are questions over how ‘public’ 
the Beetle actually is. Costing half a million 
US dollars — paid for largely through the 
National Research Foundation (using taxpay- 
ers’ money), as well as Wits and the Gauteng 
provincial government — the site is in a pri- 
vate game reserve and the tourists, when they 
come, will probably be rich. The Maropeng 
centre demonstrates this. Built at a cost of 
US$29 million, it charges $13 for admission 
(around half that for students), which prices 
out many in a country where one-fifth of the 
people still live on $28 a month. That could 
be reflected in Maropeng’s visitor numbers. 
Planned to accommodate 1 million visitors a 
year, it receives between 230,000 and 250,000 
and runs at an annual loss, picked up by the 
provincial government. Kruger National 
Park, by contrast, charges different rates for 
South Africans and foreign tourists, and 
receives 1.4 million visitors annually. 

Remarkable fossils continue to emerge in 
the Cradle, and it presents no less remark- 
able opportunities for palaeotourism. But a 
way must be found to make the specimens 
widely accessible. In situ interpretation of 
australopithecine remains should present 
a uniquely uplifting experience for all, rich 
and poor. 
IMMUNOLOGY 


Magic bullets to blockbusters 


Marian Turner delves into a history of the rapid rise of monoclonal antibodies. 


university, I made a monoclonal antibody. 
After weeks of waiting for the mice 
that I had immunized with Acanthamoeba 
parasites to mount a robust immune 
response, I fused some of their spleen cells 
to cells derived from a mouse myeloma can- 
cer. The aim: to generate immortal hybri- 
doma cells, from which I could purify an 
endless stream of specific antibodies that 
bound to the parasites. 

The protocol I used was a barely refined 
version of one described by biologists César 
Milstein and Georges Köhler in the United 
Kingdom almost three decades before. 
Hybridoma technology spawned the broad- 
reaching field of monoclonal antibodies, 
which is celebrated by historian of medicine 
Lara Marks in The Lock and Key of Medicine. 
Essential reagents in any cell-biology labo- 
ratory, monoclonal antibodies are now also 
common in medicine, from pregnancy testing 
to blood typing and disease diagnostics. 

Marks begins by summarizing scien- 
tists’ early attempts to understand protec- 
tive immunity and vaccination. The term 
magic bullets, coined by German physician 
Paul Ehrlich (who died 100 years ago next 
month) in his 1897 description of antibodies, 
is held up as an early beacon of researchers’ 
hopes for antibodies in medicine. As Marks 
shows, the hypotheses of pioneering scien- 
tists, amazingly, have often only just missed 
the mark. Ehrlich’s description of a cell producing 
‘side chains’ that break off as antibodies 
in response to encountering foreign 
substances is remarkably close to our current 
knowledge of surface immunoglobulins and 
secreted antibodies. The book’s early chapters 
also chronicle how physicians of the 1920s 
and 1930s developed serum-based therapies 
decades before anyone understood the basic 
mechanisms of antibody action. 

There were high hopes for the revenue 
potential of monoclonal antibodies, but 
not everyone was convinced at first. Marks 
describes how the National Research and 
Development Corporation in the United 
Kingdom initially failed to see the practical 
applications. As a result, the original hybri- 
doma technology was not patented. In 1979 
and 1980, US scientists won patents for essen- 
tially equivalent technology using myeloma 
cells provided by Milstein — a development 
that left many in Britain chagrined, including 
then-prime minister Margaret Thatcher. 

The fundamental protocol for making monoclonal 
antibodies has changed little in 40 years. 

Such early hiccups were soon replaced 
by focused commercialization, leading to 
an exponential rise in patents and several 
intellectual-property battles. Although some 
scientists have found this unpalatable, anti- 
body-related royalties have been ploughed 
back into medical research — by 2012, the 
UK Medical Research Council's patents alone 
had earned £486 million (US$770 million). 
The possible ways of modifying antibodies 
have surged in the 40 years since Milstein 
and Köhler published their hybridoma protocol. 
Marks discusses the generation of antibody 
fragments, chimaeric antibodies and 
‘humanized’ antibodies — products from 
non-human cells that have been modified 
to reduce unwanted immune responses. 
The unfolding antibody story paralleled 
advances in recombinant-DNA technology 
and transgenic animal models, which 
made such alterations possible. Yet it is 
mind-boggling that the first humanized 
antibodies were made before the advent of 
DNA-amplification technology. 
‘Humanized’ also describes what Marks 
has done for the antibody story. The Lock and 
Key of Medicine presents rich details of who 
at which institute collaborated with whom 
on which scientific advance or commerciali- 
zation process, and when. The meticulous 
accounts sometimes blur, but they convey 
a deep sense of the cumulative thought and 
effort embodied in today’s antibody technol- 
ogies, and remind us that interdisciplinary 
and international collaborations are not new. 
Readers in the field will appreciate the atten- 
tion paid to defining episodes, such as the 
HLDA workshops starting in 1982, which 
resulted in classification and verification sys- 
tems that brought coherence to the expand- 
ing catalogue of monoclonal antibodies. 

In the closing chapters, Marks describes 
some of the monoclonals that have become 
blockbuster drugs: rituximab, infliximab and 
trastuzumab. The stories of these antibodies 
reflect how serendipitous clinical outcomes, 
such as cancer drugs successfully treating 
arthritis, have led to deeper understanding 
of the biology of both classes of disease. 

There are some surprising gaps. There is 
no mention of the catastrophic 2006 phase I 
clinical trial of the monoclonal antibody 
TGN1412, manufactured by TeGenero 
to treat cancer and autoimmune diseases. 
The drug induced multiple organ failure in 
six healthy volunteers, caused by an unan- 
ticipated extreme immune reaction called a 
cytokine storm. The episode led to a revision 
of European guidelines for first-in-human 
trials. Also missing is a discussion of mono- 
clonal antibodies that block the immune 
regulatory proteins PD-1 or CTLA-4, which 
are arguably the hottest up-and-coming 
agents in cancer therapy today, or of the use 
of HIV-specific monoclonal antibodies for 
treatment and vaccine design. 

But it is a fast-paced field. The history of 
monoclonal antibodies ricochets between 
basic science, the clinic and the commercial 
world, and The Lock and Key of Medicine 
documents how lessons from one sphere 
have repeatedly led to advances in another. 
Marks concludes by pointing out that mono- 
clonal antibodies have received less fanfare 
than their biotechnological peers, genetic 
engineering and stem cells. Her thorough 
telling of this rich history goes some way 
towards restoring the balance. 
Taxonomic glory 
easier on eBay? 


One of zoology’s highest honours 
may now, it seems, be purchased 
on eBay (see go.nature.com/ziq152). 
For a few thousand 
dollars, you are offered the 
privilege of naming a ‘small, rare’ 
species. A species name will last 
forever, says the vendor — even as 
taxonomists themselves struggle 
to survive. 

Taxonomists invest months 
confirming that a specimen 
is new to science. They sift 
through obscure literature — 
often in a different language and 
lamentably illustrated. More 
months are spent on the species’ 
description, which must be 
accurate enough to enable future 
taxonomists (should they survive 
the sixth mass extinction) to 
confirm that their ‘new’ species is 
different. Eventually, they publish 
their work in a systematics 
journal with an impact factor 
typically below 2 — even when 
the species is a previously 
undescribed mammal (the 
olinguito Bassaricyon neblina, 
pictured; see K. M. Helgen et al. 
ZooKeys 324, 1-83; 2013). 

These low impact factors make 
it hard for taxonomists to land 
positions in academia, and job 
opportunities in museums are 
sparse. Selling perpetuity on eBay 
is starting to look like an attractive 
alternative. 

Giovanni Strona European 
Commission, Joint Research 
Centre, Institute for Environment 
and Sustainability, Ispra, Italy. 
giovanni.strona@jrc.ec.europa.eu 


Europe needs Ebola 
outbreak consortium 


The European Commission (EC) 
has been criticized for failing to 
define specific research pathways 
for tackling the recent outbreak of 
Ebola virus in West Africa (J. M. 
Martin-Moreno et al. Lancet 384, 
1259; 2014). In our view, three 
changes would improve research 
into new interventions. 

First, the EC needs to set up 


a cooperative framework for 
implementing research activity 
during outbreaks. Experience 
indicates that existing networks 
have suboptimal capability to 
involve African institutions and 
local authorities in the research 
that could help to contain the 
epidemic (S. Lanini et al. Lancet 
Infect. Dis. 15, 738-745; 2015). 

Second, the EC must put in 
place resources and infrastructure 
so that, in the event of an 
unexpected resurgence of the 
virus, intervention studies can be 
rapidly approved. 

Third, research institutions 
need to cooperate more closely 
with one another. Research 
networks have already been set 
up (see go.nature.com/9xtemn) 
and others have successfully 
supported fragile local health- 
care services. These include the 
European Mobile Laboratory 
Project, Quality Assurance 
Exercises and Networking on the 
Detection of Highly Infectious 
Pathogens, and the network of 
biosafety level 4 laboratories, 
Euronet P4. 

We suggest that an inclusive 
and committed European 
Consortium should be 
established. This should carry 
out research between epidemics 
and promptly translate the 
results into actions during 
epidemics. In our view, this 
consortium would be most 
effective if it were self-sufficient. 
Crucially, its budget structure 


would allow easy access to funds 
for implementing interventions. 
Alimuddin Zumla University 
College London, London, UK. 
David Heymann Chatham 
House Centre on Global Health 
Security, London, UK. 
Giuseppe Ippolito National 
Institute for Infectious Diseases, 
Rome, Italy. 
giuseppe.ippolito@inmi.it 


Animal studies must 
be useful, says public 


The European Commission 
(EC) responded last month to 
‘Stop Vivisection’, a European 
Citizens’ Initiative to phase out 
animal testing, which was signed 
by more than one million people. 
The EC confirmed that it will not 
replace the existing directive on 
the protection of animals used for 
scientific purposes (2010/63/EU), 
which already matches the level of 
protection in countries with the 
most demanding legislation. Yet 
the citizens’ principal argument 
relates not to animal suffering, but 
to the limited usefulness of results 
from animal models. 

Citizens concerned about 
animal welfare may still accept 
research that is perceived as being 
of ultimate benefit to humans, 
but only if it delivers relevant 
results — a view that evidently 
helped to mobilize signatures in 
this case. This is one of only three 
citizens’ initiatives since 2012 that 


have gathered enough signatures 
to reach the EC, so the scientific 
community needs to take its 
criticisms seriously. 

As long as blinding, 
randomization and appropriate 
sample sizes are not standard 
practices in animal research, 
claims of maximizing its benefits 
are not credible. Unreliable data 
from poorly designed studies and 
publication bias give misleading 
results on the therapeutic value 
of candidate drugs, leading to 
disappointing clinical trials (see, 
for example, S. Perrin Nature 
507, 423-425; 2014). Researchers 
must aim to do research that 
stands up to critical scrutiny from 
all quarters. 

I. Anna S. Olsson, Nuno H. 
Franco Institute for Molecular 
and Cell Biology, Porto, Portugal. 
olsson@ibmc.up.pt 


A prescient view of 
women in evolution 


The remarkable nineteenth- 
century German biologist August 
Weismann (Nature 522, 31-32; 
2015) also took a prescient stand 
in the discourse on the role of 
women in evolution. 

Weismann challenged a 
popular theory of heredity 
proposed by US zoologist 
William K. Brooks in The Law 
of Heredity (Murphy, 1883). On 
the basis of the Lamarckian idea 
of an inheritance of acquired 
characteristics, Brooks argued 
that the ‘hereditary force’, or 
Vererbungskraft (Weismann’s 
translation), was stronger in 
men than in women, writing that 
“something within the animal 
compels the male to lead and the 
female to follow in the evolution 
of new breeds”. Weismann 
roundly refuted this idea, 
pointing out that children inherit 
as many characteristics from their 
mothers as from their fathers (A. 
Weismann Die Bedeutung der 
Sexuellen Fortpflanzung für die 
Selections-Theorie; Fischer, 1886). 
U. Kutschera Institute of Biology, 
University of Kassel, Germany. 
kut@uni-kassel.de 
Hallucigenia’s head 


The finding of pharyngeal teeth and circumoral mouthparts in fossils of the Cambrian lobopodian animal Hallucigenia 
sparsa improves our understanding of the deep evolutionary links between moulting animals. SEE LETTER P.75 


XIAOYA MA 


Fossils provide direct evidence of 
evolutionary history, and their unique 
morphological combinations can reveal 
crucial evolutionary links between extant taxa. 
Most major animal phyla first appear in the 
fossil record during the Cambrian period, 541 
million to 485 million years ago, and this early 
flowering of animal life has been termed the 
Cambrian explosion. Therefore, Cambrian fos- 
sils are particularly important for understand- 
ing the origin and early evolution of major 
animal groups. In this issue, Smith and Caron 
(page 75) redescribe one of the most celebrated 
Cambrian animals, Hallucigenia sparsa, and 
document several new features of this species, 
including its pharyngeal teeth and circumoral 
elements, which are suggested to be two of 
the few morphological characters uniting all 
groups within the Ecdysozoa. 

The Ecdysozoa is far and away the richest 
animal group. It is composed of eight extant 
phyla that shed their cuticle periodically to 
accommodate growth — nematode worms 
and crustaceans are familiar examples. The 
two commonly recognized subgroupings 
of ecdysozoans, Cycloneuralia and Pan- 
arthropoda, have distinctly different body 
plans (Fig. 1). Cycloneuralia unites worm- 
like organisms (Nematoda, Nematomorpha, 
Priapulida, Kinorhyncha and Loricifera) that 
have a non-segmented body terminating in a 
mouth that can turn inside out (eversible) and 
has a ring of nerves behind it — their brain. 
By contrast, panarthropods (Arthropoda, 
Onychophora and Tardigrada) are all seg- 
mented, with paired legs, and have a dorsal 
(upper side) brain in front of the mouth. These 
great morphological disparities have made it 
difficult to illuminate the last common ances- 
tor of the Ecdysozoa and to fully understand 
the evolutionary relationships between its 
phyla, particularly between Cycloneuralia and 
Panarthropoda. Early ecdysozoan fossils are 
crucial for addressing these questions. 

Among the earliest Cambrian fossils, ecdyso- 
zoans are the most diverse and abundant group. 
They are best shown in exceptionally preserved 
Cambrian fossil localities, such as the Cheng- 
jiang biota in China (around 518 million years 
old) and the Burgess Shale in Canada (about 
508 million years old). The body plans of some 
of the organisms represented have not changed 
much over 500 million years of evolution, such 
as priapulids (commonly known as penis 
worms) and arthropods (jointed-legged invertebrates 
with an exoskeleton and a segmented 
body, such as insects and spiders). Other Cambrian 
groups are extinct but represent crucial 
evolutionary stages, such as lobopodians (an 
informal group of worm-like animals with 
non-jointed legs) and radiodontans (a group 
of animals characterized by possessing a pair 
of frontal appendages at the anterior part of 
the head and a ventral (lower side) mouth surrounded 
by radial tooth plates). 

Figure 1 | The Ecdysozoa. This large animal group comprises eight extant phyla and two informal 
extinct groups, the lobopodians and the radiodontans. According to their body plans, ecdysozoans 
are commonly recognized as two distinct subgroups, Cycloneuralia and Panarthropoda. The fossil 
record of both can be traced to the earliest Cambrian period, and some members, such as priapulids 
and arthropods, have changed little over 500 million years of evolution. Cambrian lobopodians and 
radiodontans represent crucial evolutionary links between Cycloneuralia and Panarthropoda. Smith 
and Caron present an analysis of the lobopodian Hallucigenia sparsa that includes details of its head 
structures. (Illustrations by David Baines.) 

Cambrian lobopodians are assigned to 
Panarthropoda on the basis of their segmented 
body and paired legs, but they also share a 
worm-shaped soft body and a terminal mouth 
with cycloneuralians. These unusual character 
combinations make Cambrian lobopodians 
particularly relevant for understanding the 
evolutionary links between the two major 
ecdysozoan groups. 
Hallucigenia sparsa from the Burgess Shale 
is certainly the most famous Cambrian lobopodian 
animal. It was originally reconstructed 
upside down and considered to be one of the 
most bizarre Cambrian creatures until it was 
recognized as a lobopodian animal armed 
with dorsal spines. However, owing to lack 
of evidence of clear head structures, the front 
and rear ends of H. sparsa have been a subject 
of debate. Smith and Caron’s redescription 
includes a new set of anatomical features that 
once and for all clarifies the anterior–posterior 
orientation of H. sparsa. They show that the 
animal had an elongated head with a pair of 
dorsal eyes. It also had hardened, lamellae-like 
structures surrounding its mouth opening 
(circumoral elements), and the front part of 
its foregut (its pharynx) was lined with teeth. 

Although pharyngeal teeth and circumoral 
mouthparts have been reported in other 
Cambrian lobopodians, Smith and Caron 
have provided the most convincing evidence 
yet of equivalent structures in this extinct 
group. These findings amplify the transitional 
status of Cambrian lobopodians, because the 
pharyngeal teeth of H. sparsa most closely 
resemble those of Cambrian priapulids, 
whereas circumoral structures are also a key 
characteristic of Cambrian radiodontans. 
More crucially, H. sparsa is now regarded 
as an ancestor of living onychophorans (commonly 
known as velvet worms), so the finding 
of H. sparsa mouthparts suggests that the 
absence of circumoral elements and pharyngeal 
teeth in extant onychophorans is probably 
the result of secondary loss. Thus, this combined 
structure is now reported for all major 
ecdysozoan groups. 

Smith and Caron further notice the simi- 
larities of ecdysozoan mouthparts (see 
Supplementary Note 1, transformation series 9 
and 13 of the paper), and suggest that all pharyngeal 
teeth and circumoral structures across 
ecdysozoan groups share a single origin from 
the last common ancestor of ecdysozoans. This 
provides new anatomical features to unite the 
Cycloneuralia and Panarthropoda. 


However, this conclusion is bound to 
provoke some controversy. Limited by the 
vagaries of preservation, it is difficult to 
determine the detailed morphology and symmetry 
of the pharyngeal teeth and circumoral 
elements of H. sparsa — such details are essential 
for further comparative studies. Although 
the homology of ecdysozoan pharynxes lined 
with teeth is well accepted, the evolutionary 
links between the circumoral structures of 
Cycloneuralia and Panarthropoda are less 
clear, because these differ substantially in 
their structure, relative position, construction 
and symmetry. Therefore, a more complete 
understanding of the evolutionary origin and 
transformation sequence of these mouthparts 
depends on a more thorough comparison 
of their morphology, development and 
innervation across all ecdysozoan groups. 
For this, new fossil evidence showing transitional 
features of the mouthparts between 
cycloneuralians and panarthropods would be 
particularly enlightening. 


NANOTECHNOLOGY 


Colourful particles 
for spectrometry 

A smartphone camera, patterned with arrays of filters made from colloidal 
suspensions of coloured particles, has been transformed into a powerful tool for 
spectral analysis. SEE LETTER P.67 


NORM C. ANHEIER 


In 1857, Michael Faraday gave a well-attended
lecture at the Royal Institution of Great
Britain, during which he presented his
pioneering experimental work on the
interaction of light with matter. Faraday’s
study probed the fundamental properties of
light related to its reflection and absorption
by progressively smaller particles. During
the presentation, very fine gold particles
dispersed in liquid were shown to produce vivid
colours not seen with larger particles. Faraday
did not know that he had created suspensions
of particles now known as colloidal quantum
dots (CQDs), but, guided by insight, he
concluded that the distinct colours were due to the
minute sizes of the gold grains. On page 67 of
this issue, Bao and Bawendi describe how
they have exploited the unique optical
properties of CQDs to develop a compact optical
spectrometer that could be integrated with a
smartphone camera or used as a miniature,
hand-held sensing tool.

Faraday had glimpsed a special condition
that allows a particle’s quantum nature
to be expressed. His work set the course for
nanoscience and quantum theory, but it 
took 125 years before the physics of the phe- 
nomenon that he observed was attributed to 
quantum size effects. It is now known that,
when CQDs are exposed to light, some of 
the electrons in these particles are excited as 
they gain energy from the photons. However, 
unlike large particles and bulk materials, the 
nanoscale dimensions of the quantum-dot 
particles confine the electrons and change
the energy difference between their excited 
and relaxed states. CQDs emit light when 
the electrons relax from a higher to a lower 
energy state (Fig. 1). The colour of the light 
depends on the states’ energy difference 
and is critically linked to the size of the par- 
ticles, which can be controlled when pro- 
ducing the CQDs. The physics that underpins 
this behaviour allows CQDs to be used for 
spectroscopy. 
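
The link between particle size and colour can be illustrated with a rough effective-mass estimate. The sketch below is not taken from Bao and Bawendi's work. It assumes a spherical dot, ignores the attraction between the electron and the hole it leaves behind, and uses illustrative CdSe-like parameters (a band gap of 1.74 eV and effective masses of 0.13 and 0.45 electron masses) that are assumptions chosen only to show the trend. The function name is hypothetical.

```python
import math

# Physical constants (SI units unless stated otherwise)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # joules per electronvolt
HC_EV_NM = 1239.841984   # h*c expressed in eV nm


def emission_wavelength_nm(radius_nm, gap_ev=1.74, me_eff=0.13, mh_eff=0.45):
    """Estimate the emission wavelength of a spherical quantum dot.

    Uses bulk band gap + particle-in-a-sphere confinement energy.
    The default material parameters are illustrative assumptions.
    """
    radius_m = radius_nm * 1e-9
    # Confinement term: hbar^2 pi^2 / (2 R^2) * (1/m_e* + 1/m_h*)
    inverse_mass = 1.0 / (me_eff * M_E) + 1.0 / (mh_eff * M_E)
    confinement_j = (HBAR ** 2) * (math.pi ** 2) / (2.0 * radius_m ** 2) * inverse_mass
    energy_ev = gap_ev + confinement_j / EV
    # Photon wavelength corresponding to that energy difference
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    for radius in (1.5, 2.0, 3.0, 5.0):
        print(radius, "nm radius ->", round(emission_wavelength_nm(radius)), "nm")
```

Under these assumptions, shrinking the dot widens the energy gap and shifts the emitted light towards the blue, which is the qualitative behaviour the text describes.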

The first simple spectrometer, consisting 
of a dispersive prism, was developed by Isaac
Newton, who proved that white light is
composed of a spectrum of many colours. These
days, optical spectrometers have become
indispensable instruments used to measure the
distribution of light’s colours (wavelengths) in a
variety of complex scientific investigations. 
Astronomers use them to collect and analyse
optical spectra of exoplanets that may have
life-supporting atmospheres. Planetary scientists
are using spectrometers on board rovers on the
surface of Mars to analyse the composition of
soil and rocks, looking for clues to the planet’s
past environment and whether conditions may
have been favourable for microbial life.
Optical spectrometers routinely support activities
that underpin our daily lives, such as
biomedical research, drug discovery, renewable energy,
forensic science, environmental monitoring
and chemical detection.

Figure 1 | Colloidal quantum dots. When they are excited by ultraviolet light (pictured), colloidal
suspensions of minuscule particles (known as colloidal quantum dots, or CQDs) fluoresce at different
colours depending on the particle size. Bao and Bawendi have exploited the unique optical absorptive
properties of CQDs to develop a compact spectrometer that serves as a powerful tool to analyse the
spectral characteristics of light.

The optical spectrometers used in these 
applications tend to be complex and costly 
because of their numerous high-precision 
optical and mechanical components and 
the stringent requirements for the align- 
ment of these parts. They can also suffer 
from poor throughput, because much of 
the input light is scattered or absorbed as it 
passes through many components before 
reaching the detector for analysis. Finally, to 
be able to distinguish, or resolve, two nearby 
wavelengths, the instruments must typically 
be large. 

Bao and Bawendi have overcome many 
of these limitations through an elegant 
integration of nanotechnology with the 
image sensors used in digital cameras. Their 
spectrometer is based on a design involv- 
ing broadband absorptive filters, which are 
similar to the coatings applied to sunglasses 
to block ultraviolet light. The filters are made 
from a series of CQDs, each with a specific 
particle size. 

This design concept can be explained by 
considering the case of a single sensor and 
a broadband absorptive optical filter. The 
sensor detects visible light and the filter has 
a specific, known (measured independently) 
cut-off wavelength, below which light is 
totally absorbed. Ideally, the filter should 
efficiently transmit light above this cut-off 
wavelength. 

Now consider an additional sensor and an 
absorptive filter tuned to another, slightly dif- 
ferent, cut-off wavelength. Both pairs of sen- 
sors and filters are illuminated with light of 
unknown colour content. The difference in 
the signals registered by the two sensors is a 
measure of the incident light power between 
the different cut-off wavelengths. In principle, 
extending this approach to greater numbers of 
sensors with filters tuned to different cut-off 
wavelengths will increase the range of colours 
that can be measured and the ability to resolve 
two adjacent colours. 
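
The two-sensor argument above can be sketched in a few lines. This is an idealized illustration, not the reconstruction method used by Bao and Bawendi: it assumes perfect step filters that block everything at or below their cut-off and transmit everything above it, and the function names and sample spectrum are hypothetical.

```python
def sensor_readings(spectrum, cutoffs_nm):
    """Return the power seen behind each ideal long-pass filter.

    spectrum   -- list of (wavelength_nm, power) samples of the incident light
    cutoffs_nm -- cut-off wavelengths in ascending order; light at or below
                  a cut-off is assumed to be totally absorbed
    """
    return [sum(power for wavelength, power in spectrum if wavelength > cutoff)
            for cutoff in cutoffs_nm]


def band_powers(readings):
    """Difference neighbouring readings to get the power between cut-offs."""
    return [readings[i] - readings[i + 1] for i in range(len(readings) - 1)]


if __name__ == "__main__":
    # Hypothetical light source with three spectral lines
    light = [(450, 1.0), (520, 3.0), (610, 2.0)]
    cutoffs = [400, 500, 600, 700]
    readings = sensor_readings(light, cutoffs)
    print(readings)               # power transmitted by each filter
    print(band_powers(readings))  # power in 400-500, 500-600, 600-700 nm
```

Adding more filters with closer cut-offs narrows each band, which is why the number of distinct filters sets both the range and the resolution in this simplified picture. Real absorptive filters have gradual, overlapping transmission curves, so a practical device needs a calibration of each filter and a numerical reconstruction step rather than simple subtraction.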

Bao and Bawendi achieved this scaling by 
applying 195 different broadband absorp- 
tive CQD filters to hundreds of locations on 
the pixelated image sensor and by using an 
extended spectrum-reconstruction method 
to handle the large set of sensor readings. 

Commercial nanotechnology developments 
have led to simplified quantum-dot synthesis 
and precise control of dot size. CQDs can be 
used as tailorable broadband absorptive filters, 
because both the spectral absorption and 
the emission properties vary with particle 
size. It is now feasible to produce CQDs with
continuously and finely varying absorption 
cut-off colours, from the deep violet to near- 
infrared wavelengths. These solutions can be 
directly patterned on the image-sensor pixels 
using inkjet or direct-contact printing meth- 
ods. The long-term stability of the patterned 
quantum dots has also improved to provide 
reasonable device lifetime. It is these key fac- 
tors that enabled Bao and Bawendi to develop 
their CQD spectrometer, and the simplicity of 
their design overcomes the constraints usually 
seen in conventional spectrometers. 

Future developments in nanotechnology 
and their potential for commercialization may 
provide the full complement of CQD materi- 
als needed for spectral measurements beyond 
the visible range. One promising research area 
concerns chalcogenide CQDs, which have 
optical emission and absorption properties 
that extend to longer, infrared wavelengths.
Further technical challenges must be overcome 
to improve CQD materials and reduce optical 
losses. If practical automated quantum-dot 
patterning on image sensors can be realized,
then the costs that limit widespread integration 
of this technology into consumer electronics 
will be reduced. In the future, we may see tiny, 
high-resolution CQD spectrometers used on 
space missions or as ubiquitous sensing ele- 
ments in household devices connected to the 
Internet.


The case for pay to quit


A randomized controlled trial of four financial-incentive programmes for
smoking cessation finds that reward-based schemes lead to sustained abstinence, 
but low public acceptability of such schemes threatens their adoption. 


THERESA M. MARTEAU 
& ELENI MANTZARI


Tobacco remains the most lethal legal
product, killing up to half of all users
and accounting for more than half of
the difference in life expectancy between rich
and poor members of society. Higher pricing,
which equates to a financial penalty, is 
thought to be the most successful interven- 
tion for reducing smoking. A 50% increase in 
inflation-adjusted prices is estimated to cut 
consumption by 20% in high-, middle- and 
low-income countries, with the largest impacts 
on the young and on poor people. For those
wanting to quit, behavioural interventions 
and pharmacotherapy can help, although few 
attempts result in sustained quitting. Financial
incentives are a relatively new addition to the 
repertoire of behavioural interventions. Writ- 
ing in the New England Journal of Medicine, 
Halpern et al. present a trial of four incentive 
schemes that adds to the growing evidence 
base (Fig. 1). 

The trial is the first to compare four incen- 
tive schemes for smoking cessation against 
usual care. It involved 2,538 employees of a 
US company, their relatives and friends. Two 
of the schemes targeted individuals and two 
targeted groups of six participants who were
incentivized through the collective perfor- 
mance of the group. One individual and one 
group-based scheme provided rewards of 
US$800 for smoking cessation, and the other 
two schemes required refundable $150 depos- 
its, together with a $650 reward, for successful 
participants. 

Acceptance of the reward-based schemes 
— assessed as the proportion of participants 
enrolling in the scheme to which they were 
assigned — was 90%, much higher than for 
the deposit schemes, at 13.7%. There were 
no differences in the acceptance of individual 
and group-based schemes. When comparing 
the entire cohort of participants, regardless of 
whether or not they enrolled on the scheme 
they were offered (with those not enrolling 
in the scheme assumed to have remained 
smokers), the quit rates in the intervention 
groups at 6 months ranged from 9.4% to 
16% — all higher than the 6% quit rate in the 
usual-care group. Quitting was higher for the 
reward- than for the deposit-based schemes 
(15.7% versus 10.2%) and similar for individ- 
ual and group-based schemes. 

Figure 1 | Carrots work for smokers. Halpern et al. show that reward-based financial-incentive
schemes are more commonly accepted by smokers than are deposit-based schemes, but that both
approaches lead to higher rates of quitting than usual care.

Among the small proportion of participants
who accepted the deposit schemes, 52% had
quit at 6 months, compared with 17% of those
on reward schemes. This outcome led Halpern
et al. to conclude that use of deposits is better. 
However, the lack of interest in these schemes 
limits their usefulness, as does the absence of 
sustained effects. At 12 months, 6 months after 
the incentives were stopped, about 50% of quit- 
ters were smoking again. Only those who had 
been enrolled in reward-based schemes, who 
retained quit rates of 7.5% for the individual 
and 8.7% for the group schemes, showed an 
advantage over the 3.4% quit rate at 12 months 
achieved through usual care. 

The problem of ‘gaming’ — faking being a 
smoker to qualify for enrolment on a scheme 
or being a non-smoker to remain on a scheme 
— is of particular concern when using 
financial incentives for smoking cessation. 
Although reports of quitting by participants 
in Halpern and colleagues’ trial were validated
by checking levels of cotinine (a metabolite of
nicotine) in their saliva, smoking at enrolment
was confirmed in only a small minority of par- 
ticipants. An estimated 20% of those recruited 
were non-smokers, although this did not affect 
the study’s findings. By contrast, a study of 
pregnant smokers who were offered participa- 
tion in a financial-incentive scheme revealed 
that none of the 239 enrolled, all of whom 
were biochemically tested at baseline, were 
non-smokers’. This raises questions about the 
contextual features of incentive schemes that 
can be used to minimize gaming. 

Halpern and colleagues’ well-designed 
study, the largest known of its kind so far, 
makes a considerable contribution to the 
growing evidence about financial incentives
for smoking cessation. Two recent systematic
reviews, one of which includes the results
of this trial, suggest that financial-incentive 
schemes can be effective in achieving sustained 
quitting, particularly when substantial incen- 
tives are used and when offered to pregnant 
smokers. The effect of incentives is also
doubled in more-deprived populations,
suggesting that this approach motivates greater change
in people on low incomes, among whom rates 
of smoking are not only markedly higher but 
also resistant to reduction. This finding high- 
lights the potential contribution of financial 
incentives for reducing the health inequalities 
related to smoking, including the significantly 
higher death rates from tobacco among poor 
people. 

Were such schemes a pill or a behavioural 
intervention not involving money, they 
would doubtless be included in the range 
of interventions on offer to smokers
wanting to quit. We are, however, ‘funny about
money’ and in particular about ‘money out
of place’. Even when they are as effective
as other interventions, financial incentives
are less acceptable to the public, and in turn
potentially to health professionals and
policymakers, than those other interventions for
changing behaviours.

Why is there not greater acceptability of 
a seemingly cost-effective intervention for 
reducing the leading cause of preventable 
premature death worldwide? The low accept- 
ability of paying people to stop smoking (or to 
lose weight or take medication) reflects sev- 
eral concerns about ideas of fairness, including 
‘coercing the vulnerable’, ‘rewarding the
feckless’ and ‘not rewarding the responsible’.
The first of these often arises in the context 
of incentivizing the disadvantaged, because it 
is assumed that such individuals are less free 
to resist monetary temptations. This com- 
promises the use of incentives for addressing 
health inequalities. An example of the effect 
of the latter two concerns is a situation in the 
United States, in which objections from
non-smoking employees led to the rejection 
by management of a $750 reward scheme for 
smoking cessation (for which there was strong 
evidence of effectiveness) and the adoption of a
$625 penalty deducted from smokers’ salaries 
— an intervention for which at the time there 
was no evidence of effectiveness. 

The acceptability of financial incentives can 
change: it increases with information about 
effectiveness and varies with incentive type 
(acceptability is lower for cash incentives and 
higher for voucher schemes). Nonetheless,
financial interventions to change behav- 
iour — whether they involve price increases 
through taxation or reward schemes — are 
much less acceptable than other, often less 
effective, interventions. Studies are needed
that explore the scale and duration of finan- 
cial-incentive schemes in different popula- 
tions of smokers, and that aim to identify 
the characteristics of schemes producing 
the greatest changes. In parallel with this, 
research is needed to develop ways of increas- 
ing the acceptability of interventions that 
could significantly help to reduce the death 
toll from tobacco, particularly among poor 
individuals.


PLANETARY SCIENCE


Sink holes and dust jets 
on comet 67P 


Analyses of images taken by the Rosetta spacecraft reveal the complex landscape 
of a comet in rich detail. Close-up views of the surface indicate that some dust 
jets are being emitted from active pits undergoing sublimation. SEE LETTER P.63 


PAUL WEISSMAN 


When do 18 holes not make for a
pleasant afternoon playing golf?
When the 18 holes are located on
the surface of a comet speeding through the
Solar System. On page 63 of this issue, Vincent
et al. describe the holes, also called pits, that
are among the many discoveries of the
European Space Agency’s Rosetta mission to 
comet 67P/Churyumov-Gerasimenko (67P). 
The Rosetta spacecraft went into orbit around 
67P in August 2014, and the surprises have 
been coming fast since then. Vincent et al. 
propose a mechanism for the formation of the 
pits and identify them as one of the sources of 
active dust jets. 

Comets are the most primitive bodies in 
the Solar System; they are the remnants of its 
formation process. Comets therefore retain 
a physical and chemical record of the condi- 
tions and materials in the solar nebula — the 
gas and dust cloud out of which the Sun and 
planets formed 4.56 billion years ago. Con- 
veniently, comets have spent most of that 

time in two very cold storage locations: the
Kuiper belt beyond the orbit of Neptune and
the spherical Oort cloud outside the planetary
region, stretching halfway to the nearest stars.
The distant Oort cloud is the source of the
long-period comets that have orbital periods
ranging up to millions of years. The Kuiper belt
is the source of the Jupiter-family comets, such
as 67P, which typically have periods of less than
20 years and orbital dynamics that are strongly
affected by Jupiter.

Figure 1 | The nucleus of comet 67P/Churyumov-Gerasimenko
(67P). Vincent et al. analysed images of comet 67P taken by the Optical,
Spectroscopic and Infrared Remote Imaging System cameras on the Rosetta
spacecraft. a, The complex nucleus topography includes large, flat-floored
basins (indicated by white arrows). A large, circular pit is visible just above
the centre of the image (red arrow). b, A string of pits dots the surface of the
cometary nucleus. In active pits such as these, bright jets of dust are seen
being emitted from the sunlit walls. The contrast of this image has been
enhanced to highlight the interiors of the pits and the jets. As a result, the
cometary surface looks very bright, but in reality it reflects only about 6% of
the incoming sunlight, roughly the same as the black toner particles in a
laser printer cartridge.

As a comet approaches the Sun and warms
up, the central solid part, known as the
cometary nucleus (composed of volatile ices
and primitive meteoritic material), begins to
sublimate and becomes enveloped by a freely
outflowing atmosphere called the coma. One
of the first surprises for Rosetta, the first ever
comet-rendezvous mission, was the odd
shape of the target comet’s nucleus (Fig. 1a).
Although some nuclei composed of two large
pieces and looking like a bowling pin had
been observed before by fly-by missions to
other comets, the two lobes of 67P sit on top
of each other, with a narrow ‘neck’ in between.
There is intense speculation as to how this
odd configuration may have formed. Did two
cometary nuclei gently collide randomly in the
solar nebula, or is the nucleus a single piece
that has been oddly sculpted by sublimation
processes? Although the former is the more
likely scenario, some scientists on the mission
suspect the latter.

Rosetta’s camera system, the Optical,
Spectroscopic and Infrared Remote Imaging
System (OSIRIS), comprises narrow-angle
and wide-angle digital cameras. As
the OSIRIS team of scientists began to map
the surface of the nucleus using the cameras,
they discovered 18 pits on the surface, which
Vincent et al. now describe more thoroughly.
The cometary nucleus has a diameter of
approximately 4 kilometres. The pits are
typically about 200 metres in diameter and about
180 metres deep. Pit-like features have been
observed on other cometary nuclei, but the
morphology of the pits on 67P has not been
seen before. They typically have cylindrical
shapes with circular openings and
near-vertical walls (although at least one pit seems to
be lying at a steep angle). And some of the pits
are clearly active: images of pits that are
illuminated by sunlight show dust jets emanating
from their walls and/or floors (Fig. 1b).

How did the pits form? Vincent et al. suggest
that they are ‘sink holes’, which formed when
material near the surface of the nucleus
collapsed into the low-density interior. Rosetta’s
Radio Science Investigation team has found
that the nucleus has an average bulk density
of only 470 ± 45 kilograms per cubic metre,
about half the density of solid water ice. But
the Grain Impact Analyser and Dust
Accumulator instrument has measured a dust-to-ice
mass ratio of 4 ± 2, suggesting that silicates
and organics, rather than ices, make up about 
80% of the mass of the nucleus. This in turn 
implies that 75-85% of the nucleus interior 
is empty space, a parameter known as poros- 
ity. A high porosity is predicted by the lead- 
ing scenarios for the internal structure of 
cometary nuclei, which suggest that they are 
aggregates of smaller, icy bodies that gently
came together in the solar nebula. These
aggregates are also referred to as rubble piles. This
concept has provided insights into the behav- 
iour of comets, such as random and other 
splitting events. 
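
The porosity estimate above follows from comparing the measured bulk density with the density the same material would have if it contained no voids. The sketch below reproduces that arithmetic. The bulk density and dust-to-ice mass ratio come from the text; the grain densities for dust (2,600 kilograms per cubic metre) and ice (940 kilograms per cubic metre) are assumptions made here for illustration, and the function names are hypothetical.

```python
def compact_density(dust_to_ice, rho_dust, rho_ice):
    """Density of a void-free mixture with the given dust-to-ice mass ratio."""
    dust_fraction = dust_to_ice / (dust_to_ice + 1.0)  # mass fraction of dust
    ice_fraction = 1.0 - dust_fraction
    # Volumes add, so mass fractions are divided by the component densities
    return 1.0 / (dust_fraction / rho_dust + ice_fraction / rho_ice)


def porosity(bulk_density, dust_to_ice, rho_dust=2600.0, rho_ice=940.0):
    """Fraction of the nucleus volume that is empty space.

    rho_dust and rho_ice (kg per cubic metre) are assumed grain densities.
    """
    return 1.0 - bulk_density / compact_density(dust_to_ice, rho_dust, rho_ice)


if __name__ == "__main__":
    # Central values from the text: 470 kg per cubic metre, dust-to-ice ratio 4
    print("dust mass fraction:", 4.0 / (4.0 + 1.0))
    print("porosity:", round(porosity(470.0, 4.0), 3))
    # Explore the quoted uncertainty in the dust-to-ice ratio (4 plus or minus 2)
    for ratio in (2.0, 4.0, 6.0):
        print(ratio, round(porosity(470.0, ratio), 3))
```

A dust-to-ice ratio of 4 corresponds to a dust mass fraction of 0.8, matching the figure of about 80% quoted in the text. The porosity that results depends on the assumed grain densities, so the numbers printed here should be read as indicative rather than as the values derived by the mission teams.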

The morphology of 67P’s surface is domi- 
nated in some areas by large, flat-floored 
basins, similar to features seen on the nucleus 
of comet Wild 2. It has been suggested that
these are sublimation basins that slowly widen 
as the walls sublimate, leaving large, non- 
volatile particles that cover the basin floor. 
The basins cannot be impact craters because 
they have the wrong size distribution (there 
are too many large ones), and because not 
many impact craters are expected on a small 
cometary nucleus such as 67P. 

Could the pits described by Vincent et al. 
be the precursors of the basins, slowly wid- 
ening as their walls sublimate? Many of the 
pits found by OSIRIS are located in the same 
region on the nucleus where many of the large 
sublimation basins are found. Both comet 67P 
and comet Wild 2 are relatively young — that 
is, they have only recently (within the past 
60 years) been perturbed by the gravitational 
field of Jupiter to perihelion distances (the 
point in their orbit closest to the Sun) at which 
it is warm enough for water ice in the nucleus 
to sublimate, and at which the activity that 
manifests itself as the bright cometary coma 
and tails begins. If this is so, why are sublima- 
tion basins not observed on other, perhaps 
older, Jupiter-family comets such as Tempel 1 
and Hartley 2? Older nuclei may have accu- 
mulated thicker layers of non-volatile mater- 
ials that have buried the sublimation basins 
and substantially lowered the activity levels of 
those comets. 

Rosetta has already indicated that it has 
more surprises for us. On 13 June 2015, the 
orbiter began receiving signals from the Philae 
lander, which is on the surface of the comet 
nucleus and was last heard from in Novem- 
ber 2014. With its batteries recharging, Philae 
probably has much more information to trans- 
mit about its final landing location. Also, the 
activity of the nucleus is expected to reach a 
maximum soon after the comet passes through 
perihelion at 1.25 astronomical units from 
the Sun (a point about 25% farther from the 
Sun than Earth’s orbit) in mid-August 2015. 
Rosetta will then follow 67P away from the 
Sun as cometary activity begins to wane. What 
changes will we see on the nucleus surface? 
And how will this alien golf course look from 
Rosetta’s vantage point then?


Paul Weissman is at the Jet Propulsion 
Laboratory, NASA, Pasadena, California 
91109, USA. 

e-mail: weissman@jpl.nasa.gov 


1. Vincent, J.-B. et al. Nature 523, 63-66 (2015).
2. Sierks, H. et al. Science 347, aaa1044 (2015).
3. Rotundi, A. et al. Science 347, aaa3905 (2015).
4. Donn, B. & Hughes, D. in 20th ESLAB Symp. Exploration of Halley’s Comet (eds Battrick, B. et al.) 523-524 (ESA, 1986).
5. Weissman, P. R. Nature 320, 242-244 (1986).
6. Kirk, R. et al. in 46th Lunar and Planetary Science Conf. Abstr. 2244 (2015).


EVOLUTION


Reptile sex determination goes wild


Wild populations of an Australian lizard have sex chromosomes and also exhibit 
temperature-controlled sexual development, providing insight into how these 
two sex-determining mechanisms may evolve back and forth. SEE LETTER P.79 


JAMES J. BULL 


Vertebrate sex determination is getting
interesting. On page 79 of this issue,
Holleley et al. report elaborate field
and laboratory studies on an egg-laying
Australian lizard, Pogona vitticeps, and reveal that
its sex is determined both by its complement
of chromosomes and by the temperature at
which its eggs are incubated. Earlier reports
had hinted at the possibility of combined geno- 
typic and environmental sex determination in 
some lizards, but it had never before been con- 
vincingly reported in the wild. 

Fifty years ago, the dominant view, laid 
out by Susumu Ohno, was that there was an
inexorable evolutionary progression from 
genetically labile mechanisms of sex deter- 
mination to highly refined and differenti- 
ated sex chromosomes. Ohno suggested that 
this progression was recapitulated across the
vertebrate groups, with sex chromosomes
becoming increasingly entrenched
as one moved up what was then perceived
as the evolutionary phylogenetic ladder:
fish had labile systems, mammals and
birds had entrenched sex-chromosome
systems, and reptiles were in the middle of
this transition.

Figure 1 | Pogona vitticeps. Holleley et al. show that, in this lizard species, embryos that have two Z
chromosomes and are thus genetically male can develop as female at warm egg-incubation temperatures.

Within two decades, that view had radically
changed. By then, many reptiles were known
to have full-blown sex-chromosome systems,
whereas many others had sex determination by
incubation temperature and no hint of
inherited sex. It was also realized that the evolution
of sex determination follows basic
evolutionary principles, and that chromosomal and
environmental sex determination can both be
highly functional, adaptive systems. In other
words, they are not different steps along an
evolutionary progression, but are alternative
states that could, in theory, evolve back and 
forth. At the time, however, such transitions 
were unknown, and it seemed that each reptile 
species had one type of system or the other, like 
two sides of a coin. 

This latter perception is now also changing, 
and it seems that we are seeing the coin as 
it flips. The lizards studied by Holleley et al. 
(Fig. 1) have visibly recognizable sex chromo- 
somes with female heterogamety — females 
have a Z and a W chromosome, and males have
two Zs. However, the authors find that nearly 
20% of ZZ individuals sampled in the wild are 
female instead of male. Incubation of eggs in 
the laboratory revealed that the ZZ offspring 
develop as male at low temperatures but that 
an increasing proportion develop as female 
as the incubation temperature increases. So 
ZZ females in the wild probably come from 
warm nests. 

These observations cement previous specu- 
lation about how sex chromosomes and envi- 
ronmental sex determination may coexist and 
how the transition between them may occur. 
At first glance, two problems are created by a 
system that combines sex chromosomes and 
environmental sex determination. The first 
is that arbitrary environmental determina- 
tion of sex would lead to ZW males and ZW 
females, which when mated would sometimes 
lead to WW offspring. If the W is a degenerate
sex chromosome, meaning it has lost many of 
its functional genes, WW offspring would be 
inviable or sterile. This is solved in P. vitticeps 
by the simple fact that ZW is always female — 
only the ZZ genotype becomes either sex — so 
there is no possibility of a ZW–ZW mating and
thus no WW genotype. 

The second problem is that the consistent 
development of some ZZ individuals into 
females creates an excess of females in the 
population. This dilemma is solved through 
sex-ratio selection, which automatically 
adjusts the frequency of the W chromosome 
to progressively lower levels as more ZZ 
females are produced. Provided that the envi- 
ronment is neither too warm nor too cool, the 
equilibrium population may sit indefinitely at 
a point that includes some ZW females and 
some ZZ females. But there is a continuum of 
equilibria spanning from pure chromosomal 
sex determination to pure environmental sex 
determination as the average nest temperature 
increases, and it is a steep transition. Extended 
Data Figure 4 of Holleley and colleagues’ 
paper shows that the wild population sits on
a virtual cliff of the changeover between the 
two mechanisms. 
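
The way sex-ratio selection lowers the frequency of the W chromosome can be seen in a toy recursion. This is a textbook-style sketch under simplifying assumptions, not the model analysed by Holleley and colleagues: every ZW zygote becomes female, every ZZ zygote becomes female with the same probability k (standing in for nest temperature), mating is random, and all females are equally fecund. The function names are hypothetical.

```python
def next_zw_fraction(w, k):
    """One generation of the recursion.

    w -- fraction of breeding females that are ZW (the rest are ZZ)
    k -- probability that a ZZ zygote develops as female (assumed constant)
    """
    # ZW mothers mated to ZZ fathers give half ZW and half ZZ zygotes;
    # ZZ mothers give only ZZ zygotes.
    zw_zygotes = w / 2.0
    zz_zygotes = 1.0 - zw_zygotes
    zw_females = zw_zygotes        # ZW is always female
    zz_females = k * zz_zygotes    # sex-reversed ZZ individuals
    return zw_females / (zw_females + zz_females)


def female_fraction(w, k):
    """Proportion of all offspring that develop as female."""
    return w / 2.0 + k * (1.0 - w / 2.0)


def equilibrium_zw_fraction(k, w0=0.99, generations=500):
    """Iterate the recursion from w0 and return the final ZW fraction."""
    w = w0
    for _ in range(generations):
        w = next_zw_fraction(w, k)
    return w


if __name__ == "__main__":
    for k in (0.1, 0.2, 0.4, 0.6):
        w = equilibrium_zw_fraction(k)
        print(k, round(w, 4), round(female_fraction(w, k), 4))
```

In this simplified model the ZW fraction settles at (1 - 2k)/(1 - k) while k is below one half, with an even sex ratio, and the W chromosome is lost once k reaches one half. The equilibrium falls ever faster as k approaches one half, which gives a feel for the steep transition described above.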

The study augments this picture with sev- 
eral other observations. First, the tempera- 
tures causing ZZ lizards to become female 
are slightly lower for offspring of ZZ moth- 
ers than for offspring of ZW mothers. This 
result suggests underlying quantitative varia- 
tion in the propensity for environmental sex 
determination, a satisfying confirmation of
theory: those individuals most genetically dis- 
posed to develop as ZZ females have offspring 
that are also genetically disposed to become 
ZZ female. Second, ZZ females have markedly 
higher fecundity than ZW females. This result 
was not expected, and although it is not con- 
tradictory to theory, it raises the question of 
why. The answer may shed light on the selec- 
tive advantage of temperature-dependent sex 
determination. 

Holleley and colleagues’ findings will no 
doubt inspire parallel work on other species, 
especially in efforts to understand the transi- 
tions between sex-determining mechanisms 
and to explore the ecological and evolution- 
ary consequences of the different mechanisms. 
The ability to assess the fitness of ZZ and ZW 
females raised at the same temperature will 
enable comparisons to be made that are cru- 
cial to understanding the relative advantages 
of the two systems and the possible costs of 
sex-chromosome degeneration. 

Broader geographic and longitudinal com- 
parisons for these lizards will give insight into 
the ramifications of climate change on this 
temperature-dependent reproductive mode. 
However, the established equilibrium between 
genetically and environmentally determined 
sex in these lizards should respond quickly to 
climate change, because an overproduction of 
ZZ females in warm years would lead to a com- 
pensatory reduction in the frequency of ZW 
females in the next generation and beyond.

The findings in this one system should dove- 
tail with recent revelations of frequent changes 
in lizard heterogametic determination, a
discovery made possible by easy genome
sequencing of less-studied species. The accumulating
information about the molecular bases of
reptile sex determination will add greatly to
this understanding, and may reveal interesting
constraints imposed on the transitions. The
emerging picture is that reptilian sex
determination is more flexible on an evolutionary scale
than could ever have been imagined.


Inversion in the worm


Combinations of spatially and temporally restricted transcription factors are 
shown to coordinate movement in nematode worms by controlling the formation 
of synaptic connections to and from motor neurons. SEE LETTER P.83 


VILAIWAN M. FERNANDES 
& CLAUDE DESPLAN 


Nervous systems are staggeringly
complex. To generate appropriate 
behavioural outputs, the countless 
synaptic connections that neurons form with 
other cells must be precisely regulated to 
ensure that they are arranged in the right cir- 
cuits at the right time. On page 83 of this issue, 
Howell et al.¹ show that, in the simple neuronal
circuits of the nematode worm Caenorhabditis 
elegans, such spatio-temporal precision is 
achieved through transcription factors that 
function together to restrict the expression of 
a newly identified synaptic organizer protein 
called OIG-1. 
Nematode movement involves coordinated 
muscle contractions that are regulated by complex
interactions between motor neurons on
the worm's dorsal (upper) and ventral (lower)
sides. Two classes of inhibitory D-type motor 
neuron’ in particular help to control this pro- 
cess. Dorsal D (DD) neurons make synaptic 
connections to muscles on the dorsal side of 
the worm, and are themselves innervated by 
excitatory motor neurons called cholinergic 
neurons from the animal’s ventral side. By 
contrast, ventral D (VD) neurons innervate 
ventral muscles and receive synaptic inputs 
from dorsal cholinergic neurons. Activa- 
tion of ventral cholinergic neurons therefore 
contracts ventral muscles and activates DD 
neurons, inhibiting dorsal-muscle contrac- 
tion, whereas activation of dorsal cholinergic 
neurons leads to contraction of dorsal, but not 
ventral, muscles. 
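
The cross-inhibition described above can be written out as a small lookup. This is only an illustrative sketch of the adult wiring as stated in the text; the function name and the returned dictionary keys are hypothetical and do not come from the paper.

```python
def muscle_state(active_side):
    """Toy model of the adult D-type motor circuit described in the text.

    active_side: which cholinergic (excitatory) motor neurons fire,
    either "ventral" or "dorsal".
    """
    if active_side not in ("ventral", "dorsal"):
        raise ValueError("active_side must be 'ventral' or 'dorsal'")
    opposite = "dorsal" if active_side == "ventral" else "ventral"
    # Cholinergic neurons excite the muscle on their own side and also the
    # D-type neuron that inhibits the muscle on the opposite side:
    # ventral cholinergic -> DD -> dorsal muscle inhibited,
    # dorsal cholinergic -> VD -> ventral muscle inhibited.
    inhibitor = "DD" if active_side == "ventral" else "VD"
    return {
        "contracted": active_side,
        "relaxed": opposite,
        "inhibitory_neuron": inhibitor,
    }
```

Calling `muscle_state("ventral")` reports ventral contraction with dorsal muscle held relaxed through DD neurons, which is the alternating pattern that produces coordinated bending.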

DD neurons form during embryonic devel- 
opment, but VD neurons arise after a larva’s 
exoskeleton has moulted for the first time’. 


Figure 1 | Transcriptional regulation of synaptic specificity. a, During early
development of nematodes, excitatory motor neurons (blue) on the dorsal 
side of the embryo make synaptic connections with, and so excite, dorsal 
muscle and inhibitory dorsal D-type (DD) motor neurons (red). DD neurons 
innervate, and thus inhibit contraction of, ventral muscle. Howell et al.¹ report
that the synaptic inputs and outputs of early DD neurons are controlled by 
LIN-14. This transcription factor, together with the ubiquitously expressed 
transcription factor UNC-30, promotes expression of the protein OIG-1, 


In fact, before VD neurons develop, early DD 
neurons innervate ventral muscles and receive 
synaptic inputs from motor neurons on the 
dorsal side of the animal’ (Fig. 1a). They later 
undergo a synaptic inversion when VD neu- 
rons arise (Fig. 1b). Howell et al. investigated 
the regulatory logic behind this synaptic rewir- 
ing using worms that harbour mutations in the 
gene unc-30, which encodes an evolutionarily 
conserved transcription factor, UNC-30. This 
protein is expressed in all D-type motor neu- 
rons throughout development, and is regarded 
as a ‘terminal-identity selector’ — its expres- 
sion ultimately defines whether neurons will 
become D-type motor neurons’ >. 

In addition to the uncoordinated locomo- 
tion for which the gene is named, the authors 
found that unc-30 mutant worms had defects 
in synaptic specificity. Although the cell bod- 
ies (the nucleus-containing regions) of D-type 
motor neurons are appropriately positioned in 
unc-30 mutants, the early DD and VD neurons 
fail to innervate ventral muscles and instead 
make abnormal synaptic connections to dorsal 
muscles. Moreover, VD neurons do not receive 
synaptic inputs from dorsal cholinergic motor 
neurons. 

How does UNC-30 regulate synapse forma- 
tion in specific spatio-temporal patterns, given 
its ubiquitous expression in D-type neurons 
throughout development? Previous studies*” 
have shown that mutations in two other genes 
encoding transcription factors, lin-14 and 
unc-55, recapitulate different aspects of the 
defects found in unc-30 mutants. Expression of 
LIN-14 is temporally restricted to early devel- 
opment, and mutation of lin-14 leads to the 
abnormal formation of dorsal synapses from 
early DD neurons*”. By contrast, UNC-55 is 
restricted to VD neurons, where it regulates 
synaptic specificity to ventral muscles”. 

Howell et al. reasoned that UNC-30 might
cooperate with LIN-14 and UNC-55 to promote
the expression of a molecule that blocks
dorsal synaptic output. This molecule would 
be expected to be expressed under the control 
of UNC-30 acting with LIN-14 in early DD 
neurons, and under the control of UNC-30 
acting with UNC-55 in VD neurons, inhibit- 
ing the formation of synaptic outputs on the 
dorsal side of the animal. But under this simple 
model, ventral synaptic outputs would remain 
normal in unc-30 mutants. Because this is not 
the case, additional layers of regulation must 
be involved. 
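
The simple combinatorial model in this paragraph can be expressed as Boolean logic. The sketch below is a toy illustration, assuming that each factor is either present or absent in a given cell type; the names `oig1_expressed`, `CELL_STATES` and `dorsal_output_blocked` are hypothetical. As the text notes, this model is incomplete, because it cannot account for the loss of ventral outputs in unc-30 mutants.

```python
# Toy Boolean version of the simple model: the blocking molecule is made
# when UNC-30 acts together with either LIN-14 or UNC-55.
def oig1_expressed(unc30, lin14, unc55):
    return unc30 and (lin14 or unc55)


# Presence of each factor per cell type, as described in the text.
CELL_STATES = {
    "early DD": {"unc30": True, "lin14": True, "unc55": False},
    "late DD": {"unc30": True, "lin14": False, "unc55": False},
    "VD": {"unc30": True, "lin14": False, "unc55": True},
}


def dorsal_output_blocked(cell, mutant=None):
    """Return True if the model predicts that dorsal outputs are blocked.

    mutant: optional name of a factor to remove ("unc30", "lin14" or "unc55").
    """
    state = dict(CELL_STATES[cell])
    if mutant is not None:
        state[mutant] = False  # loss-of-function mutation
    return oig1_expressed(**state)
```

Under these assumptions, early DD and VD neurons block dorsal outputs, late DD neurons do not, and removing unc-30 releases the block in every cell type.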

OIG-1, a member of the immunoglobulin 
superfamily, whose members mediate inter- 
actions between cells, is a putative target 
of UNC-30, LIN-14 and UNC-55 (ref. 10). 
Howell and colleagues provide evidence that 
oig-1 is expressed in early DD neurons and is 
later restricted to VD neurons. Furthermore, 
its expression is altered by perturbations to 
lin-14 and unc-55 expression. The authors 
report that loss of oig-1 leads to the formation 
of dorsal synaptic outputs similar to those seen 
in unc-30, lin-14 and unc-55 mutants. More- 
over, synaptic innervation of early DD and 
VD neurons by cholinergic neurons on the 
dorsal side of the animal is disrupted in oig-1 
mutants. This suggests that OIG-1 coordinates 
both the inputs to and outputs of D-type motor 
neurons. 

Howell et al. found that OIG-1 is located 
along the ventral side of early DD and VD 
neurons. Given that the protein can organ- 
ize dorsal synaptic inputs and outputs, this 
observation suggests that it acts indirectly. 
The authors also show that forced expression 
of oig-1 could not block synaptic rewiring of 
late DD motor neurons to ventral muscles, 
indicating that other factors must cooperate 
with OIG-1 to regulate synaptic specificity. 
Identifying these cofactors, which, like oig-1, 


© 2015 Macmillan Publishers Limited. All rights reserved 




which prevents the formation of synaptic outputs from DD neurons to dorsal 
muscle cells. b, After the larva's first moult, LIN-14 is no longer expressed 
and DD neurons undergo a synaptic inversion — they become innervated by 
ventral excitatory neurons and themselves innervate dorsal muscle. Ventral 
D-type (VD) motor neurons (purple) innervate ventral muscle cells and 
express the transcription factor UNC-55, which, together with UNC-30, 
promotes OIG-1 expression and prevents VD neurons from forming 
inhibitory connections to dorsal muscle. 


must be differentially expressed in early DD 
and VD neurons, will be of great interest, as 
will determining the molecular mechanisms 
by which they act with OIG-1 to regulate syn- 
aptic inputs and outputs through pre- and 
postsynaptic partner molecules. 

Achieving appropriate synaptic specificity 
involves many developmental steps that act 
together to ensure that neurons assume the 
correct identity. Determinants of neuronal 
identity, which are regulated by terminal- 
identity selectors, include the production of 
particular neurotransmitter molecules, the 
guidance of nerve fibres in specific direc- 
tions and the appropriate growth of branched 
projections called dendrites. Howell and col- 
leagues’ work demonstrates that the formation 
and targeting of synapses are also traits that 
give neurons a particular identity, and that 
synapses, too, can be regulated by terminal-identity
selectors such as UNC-30.

U4/U6.U5 tri-snRNP is a 1.5-megadalton pre-assembled spliceosomal complex comprising U5 small nuclear RNA
(snRNA), extensively base-paired U4/U6 snRNAs and more than 30 proteins, including the key components Prp8,
Brr2 and Snu114. The tri-snRNP combines with a precursor messenger RNA substrate bound to U1 and U2 small
nuclear ribonucleoprotein particles (snRNPs), and transforms into a catalytically active spliceosome after extensive
compositional and conformational changes triggered by unwinding of the U4 and U6 (U4/U6) snRNAs. Here we use
cryo-electron microscopy single-particle reconstruction of Saccharomyces cerevisiae tri-snRNP at 5.9 Å resolution to
reveal the essentially complete organization of its RNA and protein components. The single-stranded region of U4 snRNA
between its 3’ stem-loop and the U4/U6 snRNA stem I is loaded into the Brr2 helicase active site ready for unwinding.
Snu114 and the amino-terminal domain of Prp8 position U5 snRNA to insert its loop I, which aligns the exons for splicing,
into the Prp8 active site cavity. The structure provides crucial insights into the activation process and the active site of the
spliceosome.


The protein-coding sequences of most eukaryotic genes are inter- 
rupted by non-coding segments called introns. Introns are removed 
from precursor mRNAs (pre-mRNAs) and the flanking coding seg- 
ments (exons) are spliced together to form mRNAs by two successive 
trans-esterification reactions within a dynamic multi-megadalton 
protein-RNA complex known as the spliceosome. This complex 
comprises five canonical subunits, namely U1, U2, U4, U5 and U6 
snRNPs, and numerous non-snRNP factors¹. Each snRNP contains
an snRNA, seven Sm or LSm proteins, and a number of snRNP- 
specific proteins. During the initial stages of spliceosome assembly, 
U1 and U2 snRNPs recognize the pre-mRNA 5’ splice site and 
branch point, forming pre-spliceosomal A complex. The subsequent 
binding of the pre-assembled U4/U6.U5 tri-snRNP allows formation 
of the fully assembled spliceosomal B complex, which is converted to 
the catalytically active B* complex through extensive structural and 
compositional remodelling. During this process, U4 and U6 
snRNAs, which are extensively base-paired in tri-snRNP, are 
unwound, U1 and U4 snRNPs are released, and many new proteins 
join the spliceosome’’. This leads to the formation of a highly struc- 
tured RNA network between U2, U5 and U6 snRNAs and the 5’ 
splice site and branch point sequences in the pre-mRNA. The exten- 
sively base-paired U2-U6 snRNAs harbour catalytic magnesium 
ions* and position the branch point and 5’ splice site for the first 
trans-esterification reaction, which produces exon 1 and lariat 
intron-exon 2 intermediates. Further remodelling to C complex 
enables U5 snRNA loop I to align exons 1 and 2 for nucleophilic 
attack of exon 1 at the 3’ splice site, yielding spliced mRNA and lariat 
intron products”. Finally, the spliceosome is disassembled before the 
next round of splicing. 

U4/U6.U5 tri-snRNP is the largest pre-assembled spliceosomal 
complex, containing U5 snRNA, extensively base-paired U4/U6 
snRNAs and over 30 proteins”* (Extended Data Table 1). Three key 
proteins, Prp8, Brr2 and Snu114, have crucial roles in activation of the 
spliceosome and formation of the active site’. Prp8 forms crosslinks 


with 4-thiouridine introduced at key positions within U5 and U6 
snRNAs and the substrate pre-mRNA, showing that Prp8 is involved 
in substrate positioning and closely associated with the catalytic RNA 
core’®. Brr2 helicase, the activity of which is regulated by the GTPase 
Snu114 (refs 11-13), catalyses the unwinding of the U4/U6 snRNA
duplex’*”*. Interactions between tri-snRNP proteins have been inves- 
tigated by yeast two-hybrid and in vitro binding assays'*’’. Electron 
cryo-microscopy (cryoEM) reconstruction of crosslinked human 
tri-snRNP at 21 Å resolution revealed a tetrahedral overall shape with
no clear domain separation’*®. Negative stain microscopy of cross- 
linked yeast tri-snRNP revealed a triangular shape with maximum 
dimension of 30-34 nm (ref. 19). The highly biased orientation of 
tri-snRNP on carbon films precluded full three-dimensional analysis 
while the projection structure revealed three extruding domains 
termed head, foot and arm; the arm domain adopts variable positions 
with respect to the rest. Some key proteins were localized within 
the projection structure using genetically introduced tags’. Brr2 
and U4/U6 snRNP were attributed to the head and arm domains, 
respectively. On the basis of this, it was proposed that Brr2 may 
engage with U4/U6 snRNAs for unwinding when Snu114, mapped
in the hinge region, brings the arm and head domains closer”.

The development of high-speed direct electron detectors*”’! and 
powerful maximum likelihood algorithms for classification and 
particle alignment” have made it possible to determine the structure 
of macromolecular assemblies at near-atomic resolution by cryoEM”’. 
By applying these new methods we obtained a map of native 
unstained yeast tri-snRNP at an overall resolution of 5.9 Å in which
protein α-helices and RNA double helices are readily discernible. This
enabled us to fit the double-stranded helices of U5 snRNA and U4/U6 
snRNAs as well as previously determined crystal structures or homo- 
logy models of nearly all the proteins. The structure accounts for 
a wealth of biochemical and genetic data from yeast and human 
spliceosomes, and suggests a possible mechanism for B complex 
formation and the activation of the spliceosome. 


¹MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.


2 JULY 2015 | VOL 523 | NATURE | 47 



Figure 1 | Overview of the U4/U6.U5 tri-snRNP structure with its protein and RNA components modelled into cryo-EM density. a, Front view, facing 
concave surface; b, back view; c, top view. d, 2D class average showing the different domains of tri-snRNP: head, body, arm and foot. 


CryoEM of the tri-snRNP complex 


U4/U6.U5 tri-snRNP was purified from yeast by a gentle procedure 
without crosslinking, and the sample was subjected to cryoEM analyses 
(Methods and Extended Data Fig. 1). Using a combination of statistical 
classification and movie processing’*”* (Extended Data Fig. 2), we 
obtained a density map with an overall resolution of 5.9 Å by the ‘gold
standard’ Fourier shell correlation (FSC) = 0.143 criterion”® with local
resolution ranging from 5.0 Å to 20 Å (Extended Data Figs 3 and 4;
Methods). The map revealed clear densities for double-stranded RNA,
with protein helices appearing as long tubes and β-sheets as flat densities
(Supplementary Video 1). The density for the LSm proteins in the
flexible arm domain became clearer after using a new multi-body 
refinement method (Methods and Extended Data Fig. 3). 
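
As an illustration of how an FSC = 0.143 resolution estimate is read from a curve, the sketch below finds the spatial frequency at which an FSC curve first drops below the threshold and converts it to Å. It is a minimal sketch, not the authors' processing pipeline; the function name and the numbers in the usage note are invented for demonstration only.

```python
def resolution_at_fsc(freqs, fsc, threshold=0.143):
    """Estimate resolution (in Å) where the FSC curve first falls below threshold.

    freqs: spatial frequencies in 1/Å, in ascending order.
    fsc:   FSC values between two independently refined half-maps,
           one value per frequency.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(freqs)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two bracketing shells.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None
```

For a made-up curve with `freqs = [0.05, 0.10, 0.15, 0.20]` and `fsc = [0.99, 0.80, 0.243, 0.043]`, the crossing falls halfway between the last two shells, at 0.175 Å⁻¹, which corresponds to a resolution of roughly 5.7 Å.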


Overall structure 


Yeast U4/U6.U5 tri-snRNP has an overall Y-shape with a maximum
dimension of approximately 300 Å (Fig. 1). The large domain of Prp8
(residues 885-1,824), consisting of the reverse transcriptase-like (RT), 
linker and type II endonuclease-like domains”®, is located near the 
centre of the assembly and its crystal structure was fitted into the map 
as a rigid body”® (Fig. 2 and Extended Data Fig. 5a). The orientation of 
the RNaseH-like domain with respect to the large domain is inverted 
in tri-snRNP as compared with the Prp8—Aar2 complex’ (Fig. 2d). 
Three segments of clear double-stranded RNA density extending 
from Prp8 to the foot domain are assigned to co-axially stacked stems 
I and II, variable stem-loop and stem III of U5 snRNA (Fig. 3; 
Extended Data Fig. 6) connected to the U5 Sm core (Extended Data 
Fig. 5e). Snu114 shows a significant sequence similarity to eukaryotic 
translation elongation factor 2 (EF2)''’° comprising domains I-V 
(Fig. 4 and Extended Data Figs 5c and 7). Homology models of each 
domain of Snu114 (residues 120-1,008) were fitted individually into
the density adjacent to Prp8 and U5 snRNA, revealing a contact
between domain III of Snu114 and the RT domain of Prp8 (Fig. 4).
The structure of the N-terminal 884 residues of Prp8 is still unknown.
The N-terminal helix of the RT domain of Prp8 (RTα1)'° extends
further in tri-snRNP towards a bundle of four long helices. Another 
cluster of long helices, which makes close contact with the co-axially
stacked stems I and II of U5 snRNA, is found in the vicinity (Fig. 3b
and Extended Data Fig. 6c). The region containing residues 420-542
of Prp8 is known to interact with the N-terminal half of Snu114
(ref. 27), and 4-thiouridine introduced at C79 of U5 snRNA crosslinks
with both Prp8 and Snu114 (ref. 28), suggesting that the density adjacent
to Snu114 and U5 snRNA is part of the Prp8 N terminus. At the
tip of stem I the density assigned to the U5 loop I extends towards
the RT thumb/X domain of Prp8 and makes close contact with a
thioredoxin-like fold of Dib1 (Fig. 2c and Extended Data Fig. 5f)




Figure 2 | Prp8 in tri-snRNP. a, Domain organization of Prp8. The structure 
of the N-terminal domain (residues 1-884) is unknown. RT, reverse 
transcriptase-like domain; X, thumb/X; L, linker; E, endonuclease-like domain; 
RH, RNaseH-like domain; JM, Jab1/MPN domain. b, The large domain of
Prp8 is located at the centre of tri-snRNP. The Jab1/MPN domain is bound to 
Brr2 (refs 31, 32). c, Loop I of U5 snRNA is inserted into the active site 
cavity and in contact with Dib1. d, Prp8 in the Prp8—Aar2 complex"® is shown 
with its large domain in the same orientation as in b. In tri-snRNP, the RNaseH 
domain is inverted while the Jab1/MPN domain in complex with Brr2 is 
located at the opposite end of the large domain.

(ref. 29). This is in good agreement with the fact that a 16-kDa protein
is crosslinked to 4-thiouridine incorporated at U97 in the U5 snRNA 
loop I (ref. 28). The binding of Dib1 is further stabilized by the 
N-terminal helices of Prp8 (Fig. 2c). 

Brr2 forms a stable complex with the Jab1/MPN domain of
Prp8 (refs 30-32), and its characteristic shape was recognized in the

Figure 3 | The snRNA components
of U4/U6.U5 tri-snRNP. 

a, Secondary structures of U4/U6 and 
U5 snRNAs. b, Double-stranded 
regions of U4/U6 and U5 snRNAs 
modelled into the cryo-EM map. SL, 
stem-loop. 




Figure 4 | Structure of Snu114 in tri-
snRNP. a, Location of Snu114 in the
U4/U6.U5 tri-snRNP. b, Arrangement of
domains (I-V) in Snu114 (see Extended
Data Fig. 7). c, Domain arrangement in 
EF-G bound to the ribosome”. d, The 
interface between the N-terminal domain 
of Prp8 and Snu114. Some of the 
uninterpreted density at the interface may 
be attributed to the unmodelled switch I 
loop. e, The interaction of the switch 
region of EF-G with the sarcin-ricin loop** 
for GTPase activation. 


less-ordered head domain (Figs 1 and 5 and Extended Data Fig. 5b).
Although this part of the map is lower in resolution, the individual 
domains of yeast Brr2 were fitted into the density together with the 
Jab1/MPN domain’. This revealed a widening of the gap between the 
two RecA domains of the N-terminal cassette (Fig. 5c). Co-axially 
stacked stems I and II of U4/U6 snRNAs and the 5’ stem-loop of

Figure 5 | Brr2 mode of unwinding. a, Domain organization of Brr2
N-terminal helicase cassette (NHC). WH, winged helix; HLH, helix-loop- 
helix; FN3, fibronectin3-like domains. The inactive C-terminal helicase cassette 
(CHC) has the same domain organization. b, U4/U6 di-snRNP and its 
interaction with Brr2 in tri-snRNP. The domains of Brr2 NHC are coloured as 
in a. The single-stranded RNA between U4/U6 stem I and U4 snRNA 3’ stem-loop
is already loaded in the active site of Brr2. When the Hel308 structure”
is overlaid onto the NHC of Brr2, its 10-nucleotide DNA substrate coincides 
with the density in the Brr2 active site, which extends to U4 snRNA 3’ stem- 
loop (red dotted line). The helix-loop-helix domain of Brr2 interacts with 
U4 snRNA 3’ stem-loop (inset). c, Superposition of the RecA1 domain of Brr2 
in the crystal structure’ (Protein Data Bank (PDB) accession 4BGD, in grey) 
and in tri-snRNP (domains coloured as in a) shows the opening of the gap 
between the RecA1 and RecA2 domains (indicated by the red arrow) to 
accommodate the RNA substrate. 


U4 snRNA branching from the three-way junction are unambigu- 
ously identified near the N-terminal helicase cassette of Brr2 and 
Prp8 (Figs 1 and 3 and Extended Data Fig. 6). Snu13 and the Nop
domain of Prp31 bind to the kink-turn at the tip of the U4 snRNA
5’ stem-loop’. Initially the crystal structures of Snu13 and Prp31
were fitted individually around the kink-turn but the density clearly
showed that the coiled-coil domain of Prp31 is rotated by approximately
60° with respect to the Nop domain in tri-snRNP (Fig. 1 and
Extended Data Fig. 5g). The four-helix bundle of the coiled-coil
domain is in contact with the RT domain of Prp8 (Fig. 1).
Furthermore, Prp31 used for crystallization was a truncated form**
and in our map clear α-helical density extends from both N and C
termini (Extended Data Fig. 5g). The C-terminal helix extends from
the 5’ stem-loop to the three-way junction, in agreement with previous
foot-printing experiments™ (Fig. 1a). Prp4 contains seven
WD repeats at the C terminus” and its characteristic seven-bladed
β-propeller domain is packed against Snu13 and the junction between
the Nop and coiled-coil domain of Prp31 (Figs 1 and 5 and Extended
Data Fig. 5j). Four additional α-helices, probably belonging to the N
terminus of Prp4, were built into the rod-like density on top of the
β-propeller. Prp3 is predicted to have a ferredoxin-like domain at its
C terminus* and interacts with Prp4 and U4/U6 stem II (ref. 37).
Density sandwiched between the Prp4 WD40 domain and U4/U6
stem II, two long helices lying along U4/U6 stem II and a number 
of connected helices nearby probably belong to Prp3 (Fig. 1 and 
Extended Data Fig. 5i). The U4 core domain is wedged between the 
tandem helicase cassettes of Brr2 (Figs 1a, c and 5b and Extended 
Data Fig. 6a). The 3’ stem-loop of U4 snRNA contacts the helix- 
loop-helix domain of the N-terminal helicase cassette (Fig. 5b), which 
contains several lysine/arginine residues close to the RNA backbone 
of the 3’ stem-loop of U4 snRNA. On the basis of the previous 
labelling data’, U6 LSm proteins are fitted into the flexible arm 
region in the multi-body refined map (Fig. 1a and Extended Data
Figs 3c and 5l). 

A striking elongated curved α-solenoid density bridging the
RNaseH-like domain of Prp8 and the WD40 domain of Prp4 is 
assigned to the tetratricopeptide (TPR) motifs of Prp6 (Fig. 1b, c and 
Extended Data Fig. 5d). Prp6 is required for the accumulation of 
tri-snRNP* and is proposed to act as a bridge between U5 and 
U4/U6 snRNPs'*”’. Prp6 contains up to 19 predicted TPR motifs, each 
comprising a helix-loop-helix motif, and 37 connected idealized polyalanine
helices were built into the map. Nine canonical tandem TPR
motifs at the C terminus of the protein form a highly curved α-helical
solenoid-like structure, which contacts Snu13, U4 snRNA 5’ stem-loop
and the Prp4 WD40 domain in tri-snRNP (Fig. 1b, c). This is consistent 
with the fact that antibodies against the C-terminal fragment of human 
Prp6 immunoprecipitate U5 snRNP but not tri-snRNP, as the 
C-terminal domain in our structure is in close contact with U4/U6 
snRNP, which presumably occludes the epitope”. 


Central role of Prp8 in tri-snRNP assembly 


Our single-particle cryoEM reconstruction of yeast U4/U6.U5 
tri-snRNP has revealed a nearly complete organization of its RNA 
and protein components, although some densities remain unassigned, 
and Snu66, Snu23, Prp38 and possibly sub-stoichiometric Spp381 are 
yet to be located (Extended Data Table 1 and Extended Data Figs 4f 
and 8d, e). Prp8 positioned at the centre of the assembly functions as a 
hub of protein-protein and protein-RNA interactions, holding the 
whole assembly together (Figs 1 and 2b). In yeast, a stable Prp8-
Snu114-Aar2-U5 core domain complex is imported into the nucleus*®,
where Brr2 replaces Aar2. The Jab1/MPN and RNaseH-like domains,
held tightly onto the Prp8 large domain by Aar2 (Fig. 2d), are released
in tri-snRNP wherein the Jab1/MPN domain forms a stable complex
with Brr2 as in the crystal structure’’”* (Figs 1b and 2b and Extended 
Data Fig. 5b). The tri-snRNP structure provides the first glimpse of the 
interaction between Snu114, the U5 core domain and the N-terminal
domain of Prp8 which holds the co-axially stacked stems I and II, and 
variable stem-loop of U5 snRNA (Fig. 3). On the opposite side of Prp8, 
Snu13 and Prp31 firmly bound to 5’ stem-loop of U4 snRNA”, the U4 
Sm protein assembly, the Brr2-Jab1/MPN domain complex, the Prp3-
Prp4 complex, the RNaseH domain of Prp8, and the U4/U6 snRNA
duplex assemble together (Figs 1 and 2). 

Prp8 has a surface on which are exposed the 5’ splice-site-binding 
ACAGAGA sequence of U6 snRNA, the U6 sequences which pair 
with U2 snRNA, and U5 snRNA loop I which interacts with exon 1 
and exon 2. This surface is partly occluded by a highly conserved 
protein, Dib1 (Figs 2c and 6), suggesting its potential role in regulating 
the incorporation of RNA components into the active site cavity 
during spliceosome assembly and activation. When U4 and U6 
snRNAs are unwound, releasing U4 snRNA together with Snu13, 
Prp31, Prp3 and Prp4 (ref. 3) from the spliceosome, Brr2 stably bound 
to the Jab1/MPN domain of Prp8 is no longer held in place and could
be repositioned during catalysis and spliceosome disassembly (Fig. 5 
and Extended Data Fig. 8d). 


Brr2 mode of action during activation 


The unwinding of the U4/U6 snRNA duplex is an essential step in 
spliceosomal activation and is catalysed by Brr2 (refs 14, 15). Like

Figure 6 | Insights into activation mechanism and the active site of the
spliceosome. a, Mapping of the U4-cs1 suppressor mutations on the surface of 
Prp8. Three clusters of mutations are found in close proximity to the key 
elements of spliceosomal activation: Prp31/U4 snRNA 5’ stem-loop, Snu114 
and ACAGAGA box of the U6 snRNA. b, A model of the catalytic core of group 
II intron docked into the active site cavity by superposition of the EBS1 stem 
of group II intron (PDB 3IGI) and stem I of the U5 snRNA. 


other Ski2-like helicases, Brr2 unwinds any RNA duplex with 3’ over- 
hangs*’. The U4/U6 snRNA duplex has 3’ overhangs on both ends 
(Fig. 3a and Extended Data Fig. 6a) and it has been suggested that Brr2 
binds to the single-stranded region of U4 snRNA and translocates 
along U4 snRNA*™?. Our structure shows that Brr2 is pre-loaded 
onto the single-stranded region between U4 snRNA 3’ stem-loop 
and stem I of the U4/U6 duplex, showing definitively that it translo- 
cates along U4 snRNA. The gap between the two RecA domains is 
widened in tri-snRNP, and the prominent separator β-hairpin is
located adjacent to stem I of the U4/U6 snRNA duplex (Fig. 5b, c). 
Our purified U4/U6.U5 tri-snRNP disintegrates upon addition of 
ATP regardless of the presence of GTP or GDP but remains intact
after addition of ADP or AMPPNP, a non-hydrolysable ATP ana- 
logue (Extended Data Fig. 8). This shows that Brr2 is in an active state 
in our purified U4/U6.U5 tri-snRNP, in perfect agreement with the 
structure. In vitro the RNaseH domain binds to the forked region of 
the U4/U6 snRNA duplex adjacent to stem I and inhibits Brr2 binding 
to the substrate RNA“. In tri-snRNP the RNaseH domain fails to 
prevent substrate loading onto Brr2 or unwinding. The helix-loop-
helix domain of the Brr2 N-terminal helicase cassette interacts with
the 3’ stem-loop of U4 snRNA, and this interaction may be important 
for positioning U4 snRNA in the Brr2 active site*’**. Because of the 
limited resolution we cannot model the single-stranded region of 
U4 snRNA de novo. However, when we superpose the N-terminal
helicase cassette of Brr2 onto the Hel308 structure with partially
unwound DNA duplex*’, ten nucleotides of the substrate DNA coin- 
cide with the extra density in the Brr2 active site and six additional 
nucleotides can be accommodated in the density extending further to 
the 5’ end of the U4 snRNA 3’ stem-loop (Fig. 5b). 


Role of Snu114 in spliceosome activation


Snu114 shows substantial sequence similarity to EF2 (Fig. 4 and
Extended Data Fig. 7), suggesting that it might induce conformational 
change in the spliceosome upon GTP binding or hydrolysis and reg- 
ulate spliceosomal activation'’”’’. EF-G, the bacterial counterpart of 
EF2, enters the ribosome in the GTP-bound form. Its GTPase is 
activated when switch regions I and II are remodelled upon interact- 
ing with the sarcin-ricin loop of 23S rRNA“ (Fig. 4e) and GTP 
hydrolysis leads to translocation*. The activation process of the 
spliceosome has not been dissected in detail and it is not known at 
what stage GTP is hydrolysed or how Snu114 GTPase is activated.
Snu114 and EF2 share highly similar switch I and II sequences,
including the critical His residue, which in EF-G places a water molecule
adjacent to the γ-phosphate. Unassigned density connecting the
junction between stems I and II of U5 snRNA and the switch I and II 
loops coincides with the position of the sarcin-ricin loop (Fig. 4d). 
This is likely to be the N-terminal domain of Prp8, which may have a 
role in the activation of GTP hydrolysis. 

Before the unwinding of the U4/U6 duplex, the 5’ splice site 
sequence pairs with the ACAGAGA sequence in U6 snRNA’. The 
U4-cs1 cold-sensitive mutation, which extends U4/U6 stem I at 
the restrictive temperature and sequesters the ACAGAGA box from 
the 5’ splice site, stalls the spliceosome before unwinding** (Extended 
Data Fig. 6a). A suppressor of U4-cs1 has a duplication of the ACAGA 
sequence in U6 snRNA”. This shows that pairing of the ACAGAGA 
sequence with the 5’ splice site is a checkpoint to ensure proper 
assembly of complex B before the unwinding of the U4/U6 snRNA 
duplex. Notably, suppressors of U4-cs1 in Prp8 form three clusters on 
the surface of the large domain of Prp8 (Fig. 6a and Extended Data 
Table 2)’. In tri-snRNP, one of these clusters is located at the interface 
between the RT domain and domain III of Snu114, and another is at
the interface with Prp31 and the junction between the RT and 
N-terminal domains of Prp8, showing that this checkpoint can be 
bypassed when these subunit interfaces are tampered with (Fig. 6a). 
This suggests that the interactions between these components 
undergo allosteric changes, which possibly couple the guanine-nucleotide
binding state of Snu114 and the pairing between 5’ splice site
and the ACAGAGA sequence to the activation of the U4/U6 duplex 
unwinding. Understanding the activation process will require extens- 
ive interplay between structural and biochemical work—the 
tri-snRNP structure provides an important structural framework 
for further investigation of this process. 

The structural resemblance between the group II intron active site 
and the catalytic RNA core of the spliceosome*”’ endorsed the hypo- 
thesis that they evolved from a common ancestor™. On the basis of the 
similarity of the domain architecture between the group II intron 
encoded protein (IEP) and Prp8, we proposed that Prp8 evolved from 
IEP and recruited more domains and interacting proteins to assemble 
spliceosomal snRNAs’’, which derived from fragmented group II 
intron”. We placed the catalytic core of group II intron RNA in the 
tri-snRNP structure by superimposing its exon binding stem-loop* 
onto stem I of U5 snRNA. The group II intron catalytic core fits neatly 
into the active site cavity after removal of Dib1, which is absent from 
activated spliceosomes (Fig. 6b), and with small rearrangement it can 
make contacts with the thumb/linker region of Prp8, which crosslinks 
with the catalytic RNA core in the spliceosome””’. The structure of 
tri-snRNP clearly illustrates how the Prp8 domains and other spli- 
ceosomal proteins come together to assemble snRNAs and insert their 
functional segments into the active site cavity of the spliceosome”’. 
Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


Statistics. No statistical methods were used to predetermine sample size. 
Brr2-TAPS tagging for yeast U4/U6.U5 tri-snRNP purification. Primers 
specific for 55 nucleotides of the C terminus and 3’ UTR of BRR2 were used to 
PCR-amplify the TAPS-tag cassette together with the KanMX6 gene from pFA6a- 
TAPS-kanMX6, a modified version of pFA6a-TAP-kanMX6 in which the calmo- 
dulin-binding peptide tag is replaced by two tandem copies of the StrepII tag*'. 
The PCR product was used to transform yeast strain BCY123 [MATa pep4::HIS3 
prb1::LEU2 bar1::HISG lys2::GAL1/10-GAL4 can1 ade2 trp1 ura3 his3 leu2-3,112] 
by homologous recombination, selecting for G418-resistance. C-terminal TAPS- 
tagging of Brr2 was confirmed by PCR analysis of genomic DNA and DNA 
sequencing. 

Sample preparation. The Brr2-TAPS-tagged yeast cells (72 litres) were grown in 
YEPD medium to OD600 of 3.5, harvested and resuspended in lysis buffer (100 
mM HEPES KOH pH 8.0, 200 mM KCl, 2 mM Mg(OAc)2 and 10% w/v glycerol). 
The cells were frozen and lysed by a Freezer Mill 6870 (SPEX CertiPrep). The 
crude lysate was centrifuged at 45,000 r.p.m. for 1 h. The resulting supernatants 
were incubated with IgG sepharose overnight at 4°C. The resin was washed with 
TAPS wash buffer (20 mM HEPES KOH pH 7.9, 150 mM KCl, 1 mM Mg(OAc)2 
and 0.1% NP40) and incubated with TAPS wash buffer in the presence of TEV 
protease at 4°C overnight. The flow-through was collected and incubated with 
Streptactin resin (GE Healthcare) for 3 h. The resin was washed with TAPS wash 
buffer and particles were eluted with Strep elution buffer (20 mM HEPES KOH 
pH 7.9, 150 mM KCl, 1 mM Mg(OAc)2, 0.1% NP40, 5 mM desthiobiotin). The 
eluate was subsequently applied to a 10-30% v/v glycerol gradient centrifuged at 
210,000g at 4°C in a SWTi60 rotor. The fractions from the gradient were analysed 
by SDS-PAGE for protein composition. Glycerol was removed from the peak 
fractions containing tri-snRNP by dialysis against B150 buffer (20 mM HEPES 
KOH pH 7.9, 150 mM KCl, 1 mM Mg(OAc)2) before EM sample preparation 
(Extended Data Fig. 1a and 1b). 

Electron microscopy. For cryo-EM analysis, 3.5 µl of the tri-snRNP sample was 
applied to Quantifoil R2/2 or R1.2/1.3 grids which were previously coated with a 
6-nm-thick layer of homemade carbon film and glow-discharged (Extended Data 
Fig. 1c). The grids were blotted for 2 s at 4°C and plunged into liquid ethane using 
an FEI Vitrobot MKIII. The grids were loaded onto a Tecnai F30 Polara trans- 
mission electron microscope operated at 300 kV. Images were collected manually 
in low-dose mode at a calibrated magnification of ×79,096. The micrographs 
were recorded on either a Falcon II or a (ultra back-thinned) Falcon III detector at 
the same calibrated pixel size of 1.77 Å in movie mode at 17 frames s⁻¹. A total 
dose of 40 e⁻ Å⁻² over 2.5 s, and a defocus range of 2-4 µm were used. 
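
The acquisition parameters above imply a few derived quantities that are useful when reading the processing steps. The short sketch below works them out from the values quoted in the text (pixel size, frame rate, exposure time, total dose, and the 280-pixel extraction box used in data processing). The variable names are illustrative only and are not taken from any software used in the study.

```python
# Values quoted in the Methods text
pixel_size_A = 1.77      # calibrated pixel size (Å per pixel)
frame_rate = 17          # movie frames per second
exposure_s = 2.5         # exposure time (s)
total_dose = 40.0        # total dose (electrons per Å^2)
box_px = 280             # extraction box edge (pixels)

# Whole movie frames recorded during the exposure
n_frames = int(frame_rate * exposure_s)

# Average dose delivered per frame interval (electrons per Å^2)
dose_per_frame = total_dose / (frame_rate * exposure_s)

# Nyquist limit: the finest spacing representable at this pixel size (Å)
nyquist_A = 2 * pixel_size_A

# Physical edge length of the extraction box (Å)
box_A = box_px * pixel_size_A

print(n_frames, round(dose_per_frame, 3), nyquist_A, round(box_A, 1))
```

The frame count obtained this way agrees with the 42 movie frames per micrograph mentioned under data processing.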

Data processing. Most steps of data processing were performed in RELION” 
unless otherwise stated. The 42 movie frames for each micrograph were corrected 
for whole-image drift using MOTIONCORR”, and contrast transfer function 
(CTF) parameters were estimated from the resulting micrographs using 
CTFFIND3 and CTFFIND4 (ref. 52). A subset of 5,000 particles was picked 
manually and extracted with a 280 pixel box, followed by reference-free 2D 
classification to obtain initial 2D class averages, which were then used as refer- 
ences for automatic particle picking. Particles resulting from the first round of 
automatic picking were extracted with a 280² pixel box for reference-free 2D 
classification to obtain better references for the next round of automatic particle 
picking. All particles from the second round of automatic particle picking were 
manually checked before extracting them for reference-free 2D classification 
(Extended Data Fig. 1d, e). Prior to both autopicking runs, the templates were 
low-pass filtered to 20 Å to prevent high-resolution noise bias. A total of 347,241 
particles from 2,035 micrographs were selected from good 2D classes, and these 
particles were used for subsequent 3D processing (Extended Data Fig. 2). 
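
Low-pass filtering the picking templates to 20 Å means discarding Fourier components finer than that spacing. A minimal sketch of the idea is given below, assuming a hard circular cutoff in Fourier space; processing packages typically use a smooth edge instead, and the function name `lowpass_filter` is hypothetical rather than part of RELION.

```python
import numpy as np

def lowpass_filter(image, pixel_size_A, cutoff_A):
    """Zero all Fourier components at spatial frequencies above 1/cutoff_A.

    Hard cutoff for illustration only; a soft edge is assumed to be used
    in practice to avoid ringing.
    """
    ny, nx = image.shape
    # Spatial frequencies in cycles per Å along each axis
    fy = np.fft.fftfreq(ny, d=pixel_size_A)
    fx = np.fft.fftfreq(nx, d=pixel_size_A)
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = radius <= 1.0 / cutoff_A
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))

# Example: filter a random 64 x 64 test image sampled at 1.77 Å per pixel
rng = np.random.default_rng(0)
template = rng.normal(size=(64, 64))
filtered = lowpass_filter(template, pixel_size_A=1.77, cutoff_A=20.0)
```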

A subset of 18,000 particles from only the best 2D classes of one of the data sets 
was used for ab initio 3D reconstruction by SIMPLE-PRIME” to obtain an initial 
model of the complex, which was low-pass filtered to 60 Å for 3D classification. 
3D classification with four classes was run for 25 iterations, using an angular 
sampling of 7.5° and a regularization parameter T of 4. This resulted in two classes 
with much better reconstructed features than the others. These classes were 
combined into a subset of 179,079 particles. Auto-refinement of these particles 
resulted in a 7.6 Å reconstruction. These particles were subsequently used for 
particle-based beam-induced movement correction. For these calculations, we 
only used the first 30 frames of each micrograph with running averages of seven 
frames and a standard deviation of 1 pixel for translational alignments. We used 
the new ‘particle-polishing’ approach, which fits linear tracks through the optimal 
translations for all running averages and takes into account the movements of 
neighbouring particles on the micrographs, to further improve the accuracy of the 
particle-based movement corrections”. The B-factors for the resolution and 
dose-dependent model for radiation damage were estimated using reconstruc- 
tions from running averages of three frames. 
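
The running averages mentioned above (seven frames for the translational alignments, three frames for the B-factor estimates) can be pictured as a sliding mean over the movie frames of each particle. The sketch below is a simplified illustration, assuming a centred window that is truncated at the first and last frames; how the actual software treats the ends of the movie is not stated in the text, and `running_averages` is a hypothetical helper.

```python
import numpy as np

def running_averages(frames, window):
    """Average each frame with its neighbours inside a centred window.

    frames: array of shape (n_frames, ny, nx).
    The window is truncated at the ends of the movie (an assumption).
    """
    n = len(frames)
    half = window // 2
    out = np.empty_like(frames, dtype=float)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out[i] = frames[lo:hi].mean(axis=0)
    return out

# Example: 30 frames of a 2 x 2 "particle" whose value equals the frame index
movie = np.arange(30, dtype=float)[:, None, None] * np.ones((30, 2, 2))
averaged = running_averages(movie, window=7)
```

Averaging neighbouring frames raises the signal-to-noise ratio of each per-frame image at the cost of temporal resolution, which is the trade-off behind choosing a wider window for alignment than for damage weighting.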

Auto-refinement of the movement-corrected particles with a soft mask (with 
12-pixel fall-off) around the entire map resulted in a map at 6.4 Å resolution while 
refinement with a similar soft mask around the more rigid part resulted in a map 
at 5.9 Å resolution (Extended Data Fig. 2 and 3a). The 5.9 Å map was used for 
interpretation. Our map was also validated by a tilt-pair test (Extended Data Fig. 
3d). Local resolution analysis showed a wide range of resolution from 5.0 Å to 
20 Å (Extended Data Fig. 4a), indicating flexibility within some parts of the 
structure. Further 3D classification with a finer angular sampling interval of 
1.8° and local angular search range of 10° revealed conformational heterogeneity 
of the head, body and foot domains of the structure and did not improve the 
overall resolution of the map (Extended Data Fig. 2). 

We used a modified refinement approach in RELION, which we term ‘multi- 
body refinement’, to improve the density for the flexible arm domain (Extended 
Data Fig. 3b, c). In this approach, we used masks to divide the reference map into 
four ‘bodies’, approximately corresponding to the body, head, foot and arm 
domains. In each iteration of the auto-refine procedure, we independently aligned 
every experimental particle image against projections from the four distinct 
bodies. To minimize errors in these alignments, before the alignment against a 
given body, we subtracted projections from the other three bodies from the 
experimental particle. Because we assume that the four bodies may adopt differ- 
ent relative orientations in each particle, we kept track of the most likely orienta- 
tion for each of the four bodies for every particle during the course of the 
refinement. Thereby, subtraction of the other bodies should become ever more 
accurate, and this resulted in four sets of relative orientations for each particle. To 
express our expectation that the relative movements between the different bodies 
were limited, we used only local orientation searches (±22.5°) in the multi-body 
refinement, and centred the local searches around the orientations determined for 
the unmasked auto-refinement mentioned above. Details of this methodology 
will be described elsewhere (S.H.W.S., unpublished results), and its implementa- 
tion will be made available through incorporation into RELION. 
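
The subtraction step at the heart of the multi-body approach can be illustrated with a toy example. The sketch below assumes, for simplicity, that a particle image is the plain sum of the projections of the four bodies in real space; the actual procedure operates on CTF-affected projections at the current orientation estimates, so this is a conceptual illustration and not the implementation. The function name and body labels are hypothetical.

```python
import numpy as np

def subtract_other_bodies(particle, body_projections, target):
    """Remove the projected signal of every body except `target`.

    particle: 2D image of the whole complex.
    body_projections: dict mapping body name -> 2D projection of that body,
        each computed at its own current orientation estimate.
    """
    others = sum(p for name, p in body_projections.items() if name != target)
    return particle - others

# Toy data: four random "projections" and a particle that is their sum
rng = np.random.default_rng(0)
bodies = {name: rng.normal(size=(8, 8)) for name in ("body", "head", "foot", "arm")}
particle = sum(bodies.values())

# What remains for aligning the arm is the arm signal alone
arm_only = subtract_other_bodies(particle, bodies, "arm")
```

In this idealized case the residual equals the arm projection exactly; with real data the subtraction is only as good as the current orientation estimates for the other bodies, which is why it is expected to improve over the iterations.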

All refinements used gold standard Fourier shell correlation (FSC) calculations* 
and reported resolutions are based on the FSC = 0.143 criterion. The 
FSC curves are calculated using a soft spherical mask (Extended Data Fig. 4d). 
Prior to visualization, all maps were corrected for the modulation transfer func- 
tion of the detector and sharpened by applying a negative B-factor. 
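
For readers unfamiliar with the FSC, the sketch below shows how a curve can be computed from two independently refined half-maps and how a resolution is read off at a threshold such as 0.143. It is a bare-bones illustration for cubic maps without masking or the noise-substitution correction described in the next section; the function names are hypothetical.

```python
import numpy as np

def fsc_curve(half1, half2):
    """Fourier shell correlation between two cubic half-maps (one value per shell)."""
    n = half1.shape[0]
    f1, f2 = np.fft.fftn(half1), np.fft.fftn(half2)
    freq = np.fft.fftfreq(n) * n                      # integer frequency indices
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    n_shells = n // 2
    keep = shell < n_shells                           # ignore corners beyond Nyquist
    idx = shell[keep]
    num = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    d1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    d2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)
    return num / np.sqrt(d1 * d2)

def resolution_at(fsc, box_px, pixel_size_A, threshold=0.143):
    """Resolution (Å) at the first shell where the FSC falls below the threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        return 2 * pixel_size_A                       # never drops: report Nyquist
    shell = max(int(below[0]), 1)
    return box_px * pixel_size_A / shell

# Example with a synthetic map compared against itself
rng = np.random.default_rng(1)
volume = rng.normal(size=(16, 16, 16))
curve = fsc_curve(volume, volume)
```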
Local resolution analyses. Local resolution analyses were performed by 
ResMap™ and compared with those calculated by us for each protein/RNA com- 
ponent of the map (Extended Data Fig. 4a-c). For the latter calculations, FSC 
curves are calculated using a soft spherical mask (with a 30-pixel fall-off) around 
each protein/RNA component of interest. Convolution effects of the masks on the 
FSC curves were corrected using high-resolution noise substitution’. Resolution 
was estimated at FSC = 0.143. These calculations were performed for each of the 
following components: Prp8 large domain, Prp8 RNaseH domain, Prp8 Jab1/ 
MPN domain, Prp8 N-terminal domain, Brr2, U4 Sm with U4 snRNA 3’ stem- 
loop, U5 Sm with Sm site, Snu114, Dib1, Prp6, Prp3, Prp4, Snu13, Prp31, LSm, U5 
snRNA and U4/U6 snRNA. Extended Data Fig. 4d shows some representative 
curves from these calculations. 

FSC curves of model versus map were calculated using the Xmipp package” 
and the reported resolutions were based on the FSC = 0.25 criterion. FSC curves 
of model versus map were calculated for not only the entire model of all compo- 
nents but also different parts of the maps. The map of each modelled component 
was extracted from the tri-snRNP map using a soft mask (with a 5-pixel soft edge) 
surrounding the component. A map of each model was created by the program 
pdb2mrc within the EMAN package*’. Some proteins/domains that are close 
together were grouped together for these calculations, including Prp8 
N-terminal domain/Dib1, Brr2/Prp8-Jab1/MPN domain and Prp3/Prp4. 
Extended Data Fig. 4e shows some representative curves from these calculations. 
Model fitting and building. Locations of available X-ray or homology models 
were fitted initially by visual inspection of the tri-snRNP map and low-pass filtered 
maps (to 10 Å) were generated for each model in Chimera followed by fit optim- 
ization in Chimera™. For the LSm proteins that are in the flexible arm region, the 
map resulting from multi-body refinement (Extended Data Fig. 3c and 5l) was 
used for fitting. Further rigid-body fitting was performed in Coot®. The homology 
model for Snu114 was prepared by I-TASSER web server® based on the crystal 
structure of the yeast elongation factor 2 (ref. 26; PDB 1N0V) (Fig. 4 and Extended 
Data Fig. 5c). The model was manually inspected and the disordered regions were 
removed. The model for the ferredoxin-like domain of Prp3 is available at the 
Yeast Genome Center (http://www.yeastrc.org), which contains its structure pre- 
dictions®'. The model with the highest Mammoth Confidence Metric (MCM) 
score was selected for fitting. For Prp4, the protein sequence was input into 
Robetta Beta Full-chain Protein Structure (http://robetta.bakerlab.org), which 
yielded a model for the C-terminal part of Prp4 based on the structure of 
the WDR5 protein” (PDB 3MXX). Double-stranded RNA helices and idealized 
polyalanine helices were built into the masked map in Coot when possible. U4 
snRNA 5’ stem-loop was modelled based on the structure of the human Prp31- 
15.5K-U4 snRNA complex” (PDB 2OZB) using ModeRNA modelling tool®. 
Yeast Snu13 structure“ (PDB 2ALE) was fitted into the map. U4 snRNA 3’ 
stem-loop partial model was adapted from the structure of the human U4 
snRNP core domain® (PDB 4WZJ). The short and long forms of U5 snRNA® 
are present in our sample (Extended Data Fig. 1b) but no density for 3’ stem-loop 
was observed, presumably because 3’ stem-loop attached to the Sm site with a 
long single-stranded stretch is disordered or the particle population with the long 
U5 snRNA is classified out during classification. U5 snRNP Sm core with only the 
Sm site was also adapted from the human U4 snRNP core domain. The LSm 
proteins” (PDB 4M77) were placed in the low-resolution arm region of the 
map with the flat surface of the LSm complex facing the entrance side of U6 
snRNA. The register of the LSm proteins cannot be accurately determined. 
Human Dib1 structure” (PDB 1QGV) was used for fitting. Extended Data 
Table 1 and Extended Data Fig. 4f summarize all the details of tri-snRNP compo- 
nents and modelling in our study. The active site cavity of Prp8 was described 
previously’® and defined by crosslinks with crucial elements of U5 snRNA, U6 
snRNA and pre-mRNA’ and suppressors of defective splice site mutations”. 
U4-cs1 mutants have been described**”. Extended Data Table 2 summarizes all 
the U4-cs1 mutants and their locations in tri-snRNP. 

Map and model visualization. Maps were visualized in Chimera”*®. Map segmenta- 
tion was performed in Chimera using each of the fitted models and the ‘zone- 
masking’ function (Fig. 1 and Extended Data Figs 5 and 6b, d). The LSm protein 
density was obtained from multi-body refinement and low-pass filtered to 20 Å. For 
all the remaining components, the sharpened tri-snRNP map (B = −214 Å²) low- 
pass filtered to 5.9 Å was used. Figures were generated using either Chimera®* or 
PyMOL (http://www.pymol.org) and the video was made in Chimera”. 
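
To give a feel for what sharpening with B = −214 Å² does, the sketch below evaluates the amplitude scaling factor under the commonly used convention exp(−B/(4d²)), where d is the resolution in Å. The convention is an assumption here, since the text does not spell out the formula, and `sharpening_gain` is a hypothetical helper.

```python
import math

def sharpening_gain(b_factor_A2, resolution_A):
    """Amplitude scaling applied at a given resolution for a given B-factor.

    Assumes the convention exp(-B / (4 d^2)); a negative B boosts
    high-resolution terms.
    """
    return math.exp(-b_factor_A2 / (4.0 * resolution_A ** 2))

# Gain at the 5.9 Å cutoff and at 20 Å for B = -214 Å^2
gain_high = sharpening_gain(-214.0, 5.9)
gain_low = sharpening_gain(-214.0, 20.0)
print(round(gain_high, 2), round(gain_low, 2))
```

Under this convention, Fourier terms near the 5.9 Å cutoff are boosted several-fold relative to low-resolution terms, which is why the sharpened map is also low-pass filtered at that resolution.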

ATP assays. Purified tri-snRNP from glycerol gradient (~25 nM) was incubated 
at 30°C for 30 min in the presence of either no nucleotide or with each of the 
following nucleotide combinations: ATP, ATP/GTP, ATP/GDP, ADP and 
AMPPNP (1 mM each). The samples (10 µl) were loaded onto a native agarose 
gel (0.5% in TB buffer supplemented with 1 mM MgCl2) and run at 75 V at 4°C 
for 2.5 h. The gel was stained with ethidium bromide for 1 h before being imaged 
by a Syngene UV imager (Extended Data Fig. 8a). For negative staining, the 
sample was also treated similarly and stained with 2% uranyl acetate before 
imaging on a Tecnai T12 transmission electron microscope operated at 120 kV 
(Extended Data Fig. 8b, c). 

Extended Data Figure 1 | U4/U6.U5 tri-snRNP sample used for this study. 
a, Coomassie-blue-stained SDS-PAGE gel showing protein composition of 
the purified tri-snRNP. U5-, U4/U6- and tri-snRNP-specific proteins are 
labelled in blue, red and teal, respectively. Sm proteins present in both U5 
and U4/U6 are in black. b, Toluidine-blue-stained denaturing acrylamide (9%) 
gel showing RNA compositions. c, Electron cryo-micrograph of tri-snRNP 
where the carbon-coated grid was discharged in N-amylamine. d, e, Reference- 
free two-dimensional class averages of a data set collected on a grid 
discharged in air and N-amylamine, respectively. 
Extended Data Figure 2 | Classification and refinement procedures used in 
this study. A total of 367,327 particles were subjected to reference-free 2D 
classification. A subset of 347,241 particles from good 2D classes was selected 
for 3D classification using an initial model obtained from SIMPLE-PRIME™, 
which was low-pass filtered to 60 Å. The data were divided into four 3D classes, 
two of which (a total of 179,079 particles) showed better features and were 
combined for refinement. This resulted in a 7.6 Å reconstruction. To further 
improve the reconstruction, these particles were subjected to beam-induced 
motion correction (particle polishing)**. Refinement of these polished particles 
with a soft mask around the rigid part of the map (as indicated by the red 
envelope) yielded a 5.9 Å reconstruction while refinement with a mask around 
the whole map yielded a 6.4 Å reconstruction. The polished particles were also 
subject to further 3D classification with a finer angular sampling of 1.8°. 
The most populated class (47,674 particles), which also has the best rotational 
accuracy, was refined with a soft mask around the whole density. This resulted 
in a 7.0 Å reconstruction. In this study, the 5.9 Å reconstruction was used 
for subsequent biological interpretation. All steps were performed in RELION” 
unless otherwise stated. 
Extended Data Figure 3 | CryoEM maps and tilt-pair validation. a, CryoEM 
density of the whole tri-snRNP at 5.9 Å resolution by ‘gold standard’ Fourier 
shell correlation (FSC) of 0.143 criterion at two different contour levels. The 
high contour map (gold) shows well-resolved densities for protein and RNA 
helices and flat densities for β-sheets. The low contour map (silver) shows 
densities for the more flexible head and arm. The map was sharpened by a 
B-factor of −214 Å² and low-pass filtered to 5.9 Å as determined by RELION. 
b, The unsharpened full map of tri-snRNP. c, The map resulting from multi- 
body refinement, in which tri-snRNP is divided into four parts: the head, body, 
arm and foot. This resulted in better density for the arm domain (indicated by 
red circles), which is at 20 Å resolution. d, Tilt-pair validation plot for tri- 
snRNP. This was obtained from 1,196 particles from 32 micrograph pairs, 
imaged at 0° and 10° tilt angles. The position of each dot represents the 
direction and the amount of tilting for a particle pair in polar coordinates. Blue 
dots correspond to in-plane tilt transformations; red and purple dots 
correspond to out-of-plane tilt transformations. Blue dots cluster in the same 
region of the plot at a tilt angle of approximately 10° as indicated by the red 
circle. 
Extended Data Figure 4 | Resolution estimation of tri-snRNP map. a, Local 
resolution of the tri-snRNP map estimated by ResMap using the colour 
scheme shown in panel c. b, Local resolution of the tri-snRNP map calculated 
by ‘gold-standard’ FSC. For each modelled protein/RNA component of the 
map, a soft mask (with a 30-pixel soft edge) surrounding 
the region of interest was prepared and used for FSC calculations. Convolution 
effects of the masks on the FSC curves were corrected using high-resolution 
noise substitution®*. Resolution was estimated at FSC = 0.143. Local resolution 
for the unmodelled region of the map (in red) was not estimated. c, Local 
resolution of model versus map. The map of each modelled component was 
extracted from the map using a soft mask (with a 5-pixel soft edge) surrounding 
the component. The model was converted into density by EMAN*. FSC of 
model versus map was calculated using Xmipp**. The map is coloured 
according to resolution estimates based on an FSC threshold of 0.25. The lower 
resolution estimates from the FSC of model versus map compared to the 
estimates from ResMap and the gold-standard FSCs are explained by the nature 
of our models. Because of the limited resolution of our map, we did not perform 
full atomic refinement, but placed known crystal structures and homology 
models as rigid bodies in the map. d, Gold-standard FSC curves for the 
whole tri-snRNP map and some of its components calculated as described in 
b. e, FSC curves of model versus map for the whole model and some of the 
components. f, The full tri-snRNP map in which portions of the structure 
produced from crystal structures, homology modelling and de novo building 
or unmodelled are coloured as indicated. 
Extended Data Figure 5 | Fitting of protein components into tri-snRNP 
map. a, Prp8(885-2,413) crystal structure’® (PDB 4I43, green) and additional 
helices built de novo assigned to the N terminus of Prp8 (blue). b, Brr2-Jab1/ 
MPN complex”! (PDB 4BGD). c, Snu114 homology model based on EF2 
(ref. 26). d, The Prp6 TPR motifs built into the tri-snRNP map. e, U5 Sm 
proteins (grey) with Sm site (blue) based on the human U4 Sm structure (PDB 
4WZJ). f, Dib1 (ref. 29) (PDB 1QGV). g, (i) Prp31. (ii), Comparison between 
the crystal structure of human Prp31(78-333) (ref. 33) (PDB 2OZB, grey) 
and that in tri-snRNP (yellow and blue). The coiled-coil domain (yellow) 
rotates by 60° in tri-snRNP with respect to the Nop domain (grey). Additional 
helices (blue) that extend from the N and C termini were built. h, U4 Sm 
proteins with part of U4 snRNA (blue) based on the human U4 Sm structure. 
i, Prp3 model. The ferredoxin-like domain was obtained from homology 
modelling while the extra helices were built de novo. j, Prp4 WD40 homology 
model with the extra helices built de novo. k, Snu13 (ref. 64) (PDB 2ALE). 
l, U6 LSm proteins” (PDB 4M77). 


©2015 Macmillan Publishers Limited. All rights reserved 


[Figure: secondary-structure diagrams of U4/U6 snRNA and U5 snRNA, with labelled 5′ and 3′ stem-loops, stems I-III, Sm site, variable stem-loop and internal loop II.] 

map. a, c, The sequences and predicted secondary structures of U4/U6 snRNA 
and the long version of U5 snRNA, respectively. b, d, The maps of the fitted 
[…] assigned to U5 snRNA is also shown in d. 
Extended Data Figure 7 | Sequence alignment of yeast and human Snu114 
with yeast and human elongation factor 2 (EF-2). The secondary structures 
of our homology model for yeast Snu114 and the yeast EF-2 (ref. 26) (PDB 
1N0V) are shown on the top and bottom of the alignment, respectively. 
Important sequence elements are also shown. The greyscale shading indicates 
the level of sequence conservation. A higher level of conservation is shown in a 
darker shade. 
Extended Data Figure 8 | The effect of ATP on Brr2-TAPS purified tri- 
snRNP. a, Ethidium-bromide-stained native agarose gel (0.5%) showing the 
effects of ATP addition to Brr2-TAPS purified tri-snRNP used in this study. 
Upon ATP addition either without or with GTP/GDP, tri-snRNP fell apart 
(lanes 1-4). Under the same conditions, the addition of ADP or the non- 
hydrolysable ATP-analogue, AMPPNP, had no effects on the complex (lanes 5, 
6). b, c, The effect of ATP addition observed by negative stain microscopy. 
When ATP was not present, tri-snRNP particles could be observed. When ATP 
was added to the sample before grid preparations, tri-snRNP particles fell apart 
as observed by many small components on the micrograph rather than tri- 
snRNP particles. d, Tri-snRNP model where U4/U6 snRNP proteins are not 
shown. In tri-snRNP, the Brr2-Prp8(Jab1/MPN) complex is loosely associated with the 
remaining U5 snRNP components including Prp8(Large), Prp8(RNaseH), Prp8(N-term), 
Snu114, Dib1, U5 Sm proteins and U5 snRNA. After U4/U6 snRNA unwinding 
by Brr2, Brr2-Prp8(Jab1/MPN) could be repositioned within the spliceosome. e, A 
schematic showing the arrangement of tri-snRNP protein and RNA 
components. 
Extended Data Table 1 | Components and modelling of yeast U4/U6.U5 tri-snRNP 

Complex             Component   Total       MW          Domain / element               Residues           Model (PDB code)
                                residues
U5 snRNP            Prp8        2413        279,299     N-terminal domain              1-884              α-helices modelled
                                                        RT-like                        885-1251           4I43
                                                        Thumb/X                        1257-1375          4I43
                                                        Linker                         1376-1649          4I43
                                                        Endonuclease                   1653-1824          4I43
                                                        RNaseH-like                    1839-1029          4I43
                                                        Jab1/MPN                       2150-2396          4BGD
                    Brr2        2163        246,125     N-terminal domain              [illegible]        not modelled
                                                        N-terminal helicase cassette   478-1309           4BGD
                                                        C-terminal helicase cassette   1330-2163          4BGD
                    Snu114      1008        114,025     N-terminal domain              1-114              not modelled
                                                        G domain                       120-443            homology model (1N0V)
                                                        domain II                      446-580            homology model (1N0V)
                                                        domain III                     603-671            homology model (1N0V)
                                                        domain IV                      675-853            homology model (1N0V)
                                                        domain V                       856-990            homology model (1N0V)
                    Dib1        143         16,774      thioredoxin-like                                  not modelled
                    SmB         196         22,403      Sm fold                                           4WZJ (human model)
                    SmD3        110         11,229      Sm fold                                           4WZJ (human model)
                    SmD1        146         16,288      Sm fold                                           4WZJ (human model)
                    SmD2        110         12,856      Sm fold                                           4WZJ (human model)
                    SmE         94          10,373      Sm fold                                           4WZJ (human model)
                    SmF         96          9,659       Sm fold                                           4WZJ (human model)
                    SmG         77          8,479       Sm fold                                           4WZJ (human model)
                    U5 snRNA    [illegible] [illegible] Loop 1                         92-102             not modelled
                                                        Stem 1                         84-91; 103-110     A-form double helix
                                                        IL1                            75-83; 111-113     not modelled
                                                        VSL                            41-74              A-form double helix
                                                        Stem 2                         28-40; 114-125     A-form double helix
                                                        IL2                            13-27; 126-135     not modelled
                                                        Stem 3                         4-12; 136-144      A-form double helix
                                                        3'SL                           185-212            not modelled
U4/U6 snRNP         Snu13       126         13,570                                                        2ALE
                    Prp31       494         56,305      N-terminal domain                                 α-helices modelled
                                                        coiled-coil domain                                2OZB (human model)
                                                        Nop domain                                        2OZB (human model)
                                                        C-terminal domain                                 α-helices modelled
                    Prp3        469         55,877      N-terminal domain                                 α-helices modelled
                                                        Ferredoxin-like domain                            model obtained from yeast genome center
                    Prp4        465         52,425      N-terminal domain                                 α-helices modelled
                                                        β-propeller domain             166-465            3MXX from Robetta prediction
                    SmB         196         22,403      Sm fold                                           4WZJ (human model)
                    SmD3        110         11,229      Sm fold                                           4WZJ (human model)
                    SmD1        146         16,288      Sm fold                                           4WZJ (human model)
                    SmD2        110         12,856      Sm fold                                           4WZJ (human model)
                    SmE         94          10,373      Sm fold                                           4WZJ (human model)
                    SmF         96          9,659       Sm fold                                           4WZJ (human model)
                    SmG         77          8,479       Sm fold                                           4WZJ (human model)
                    LSm2        95          11,164      Sm fold                                           4M77
                    LSm3        89          10,020      Sm fold                                           4M77
                    LSm4        72          20,304      Sm fold                                           4M77
                    LSm5        93          10,415      Sm fold                                           4M77
                    LSm6        86          9,396       Sm fold                                           4M77
                    LSm7        15          13,010      Sm fold                                           4M77
                    LSm8        09          12,385      Sm fold                                           4M77
                    U4 snRNA    160         51,390      Stem I                         57-64              A-form double helix
                                                        Stem II                        1-17               A-form double helix
                                                        5'SL                           20-53              homology model (2OZB)
                                                        central domain                 65-80              partial homology model (2P6R)
                                                        3'SL                           91-142             partial model 4WZJ (human model)
                    U6 snRNA    112         36,088      5'SL                           1-25               not modelled
                                                        Stem I                         55-62              A-form double helix
                                                        Stem II                        64-80              A-form double helix
tri-snRNP specific  Prp6        899         104,234                                    1-191              not modelled
                                                        TPR domain                     192-899            α-helices modelled
                    Snu66       [illegible] 66,426                                                        not modelled
                    Prp38       242         27,957                                                        not modelled
                    Snu23       194         22,682      C2H2 zinc finger-like                             not modelled
                    Spp381      291         33,764                                                        not modelled

See Methods for details. 
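Extended Data Table 2 | U4-cs1 suppressors 

Region      Mutation            Domain              Location                       Contact
region a    R236G               N-terminal domain                                  unknown
            I261P               N-terminal domain                                  unknown
            [illegible]         N-terminal domain                                  unknown
region b    K611R               N-terminal domain                                  unknown
            E624G               N-terminal domain                                  unknown
            N643S               N-terminal domain                                  unknown
            [illegible]644A     N-terminal domain                                  unknown
            D651G or N          N-terminal domain                                  unknown
            H659P               N-terminal domain                                  unknown
            K684E               N-terminal domain                                  unknown
region c    E788G or V          N-terminal domain                                  unknown
            N796S               N-terminal domain                                  unknown
            W856R               N-terminal domain                                  unknown
            E860K               N-terminal domain                                  unknown
            [illegible]
cluster 1   D1094A or N or V    RT domain           loop in 4-stranded β-sheet     interface with Prp31
            M1095T              RT domain           loop in 4-stranded β-sheet     interface with Prp31
            [illegible]         RT domain           [illegible]                    interface with Prp31
            N1099K              RT domain           near 4-stranded β-sheet        interface with Prp31
            I1104M              RT domain           near 4-stranded β-sheet        interface with Prp31
            [illegible]         RT domain           near 4-stranded β-sheet        interface with Prp31
cluster 2   P1191L or S or T    RT domain           within loop following α12      interface with Snu114 domain III
            D1192Y              RT domain           within loop following α12      interface with Snu114 domain III
            N1194D              RT domain           within loop following α12      interface with Snu114 domain III
            [illegible]
cluster 3   L1634F              endonuclease        top surface
            L1641F              endonuclease        top surface
            I1685[illegible]    endonuclease        top surface
            P1688L or R         endonuclease        top surface
            A1754V              endonuclease        side surface
            N1809D              endonuclease        side surface
            [illegible]         endonuclease        [illegible] surface
region f    V1860D or N         RNaseH              β-finger
            I1861P              RNaseH              β-finger
            V1862A or D or Y    RNaseH              β-finger
            I1875T              RNaseH              β-finger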

All suppressor mutants are described in Kuhn and Brow®’ and Kuhn et al.’°. 
The core spliceosome as target and 
effector of non-canonical ATM signalling 


Maria Tresini’, Daniël O. Warmerdam?, Petros Kolovos*, Loes Snijder', Mischa G. Vrouwe’, Jeroen A. A. Demmers’, 
Wilfred F. J. van IJcken®, Frank G. Grosveld*, René H. Medema?, Jan H. J. Hoeijmakers', Leon H. F. Mullenders*, 
Wim Vermeulen! & Jurgen A. Marteijn! 


In response to DNA damage, tissue homoeostasis is ensured by protein networks promoting DNA repair, cell cycle 
arrest or apoptosis. DNA damage response signalling pathways coordinate these processes, partly by propagating 
gene-expression-modulating signals. DNA damage influences not only the abundance of messenger RNAs, but also 
their coding information through alternative splicing. Here we show that transcription-blocking DNA lesions 
promote chromatin displacement of late-stage spliceosomes and initiate a positive feedback loop centred on the 
signalling kinase ATM. We propose that initial spliceosome displacement and subsequent R-loop formation is 
triggered by pausing of RNA polymerase at DNA lesions. In turn, R-loops activate ATM, which signals to impede 
spliceosome organization further and augment ultraviolet-irradiation-triggered alternative splicing at the 
genome-wide level. Our findings define R-loop-dependent ATM activation by transcription-blocking lesions as an 
important event in the DNA damage response of non-replicating cells, and highlight a key role for spliceosome 
displacement in this process. 


The DNA damage response (DDR), an intricate protein network that 
promotes DNA repair, translesion synthesis, cell cycle arrest or apop- 
tosis, has evolved to counteract the detrimental effects of DNA lesions! >. 
At the core of the DDR, the ataxia telangiectasia mutated (ATM) and 
ataxia telangiectasia and Rad3-related (ATR) signalling pathways 
coordinate these processes in response to distinct types of DNA damage: 
ATR to single-stranded DNA damage, and ATM to double-strand DNA 
breaks (DSBs) and chromatin modifications'*”. These signalling net- 
works utilize post-translational modifications and protein-protein 
interactions to elicit the initial stages of the cellular response. Later 
DDR stages involve changes in gene expression. Emerging evidence 
suggests that DNA damage influences not only the expression levels 
of its target genes, by altering transcription rates and mRNA half-life, but 
also exon selection and ultimately their coding potential’. 

Production of mature, protein-coding transcripts depends on the 
selective intron removal catalysed by the spliceosome, a dynamic 
ribonucleoprotein complex consisting of five small nuclear ribonu- 
cleoprotein (snRNP) complexes (U1, U2, U4, U5 and U6), and a large 
number of accessory proteins”*. Exon/intron definition by U1 and U2 
snRNPs stimulates the recruitment of the pre-assembled U4/U6.U5 
tri-snRNP and numerous non-snRNP proteins. Following U1/U4 
displacement and extensive conformational rearrangements, the 
two-step splicing reaction is catalysed by the mature, catalytically 
active spliceosome composed of U2, U5 and U6 snRNPs*. 

The vast majority of mammalian genes are alternatively spliced to 
produce multiple mRNA variants from a single gene’, thus expanding 
protein diversity. Numerous mechanisms have evolved to provide the 
spliceosome with the plasticity required for selective exon inclusion, 
without compromising splicing fidelity. These range from the presence 
of cis-acting elements on the transcript itself to post-translational 
modifications of spliceosomal proteins, which are subject to intracellular 
and environmental cues. Additionally, since most introns are 
spliced co-transcriptionally within the chromatin environment, splicing 
decisions are subject to spatiotemporal control imposed by transcribing 
polymerases and interaction with chromatin remodellers and 
histone marks. Exon selection is also influenced by DNA 
damage. There is evidence for a broad range of damage-induced 
alternative splicing events, including alternative exon inclusion and 
exon skipping, and production of proteins with altered (often pro-apoptotic) 
function. DNA-damage-induced alternative splicing 
has been attributed to changes in the processivity rate of RNA polymerase 
(kinetic coupling), or changes in the interactions between 
the polymerase and splicing regulators (recruitment coupling), 
under the assumption that the core spliceosome is largely unaffected. 
Here we present evidence that DNA damage triggers specific pro- 
found changes in spliceosome organization, primarily that of late- 
stage spliceosomes. Additionally, we identify reciprocal regulation 
between ATM-controlled DDR signalling and the core spliceosome, 
and show that in response to transcription-blocking DNA lesions, 
non-canonical ATM activation contributes to the selection of genetic 
information ultimately included in mature transcripts. 


DNA damage targets core spliceosomes 


To gain mechanistic insight into the influence of DNA damage on 
chromatin-associated DDR processes, we used stable isotope labelling 
with amino acids in cell culture (SILAC)-based quantitative proteomic 
analysis to characterize ultraviolet (UV)-irradiation-triggered 
changes in chromatin composition (Extended Data Fig. 1a-c). 
Indirect effects of replication stress were avoided by use of quiescent 
human dermal fibroblasts (HDFs). UV-induced photolesions inhibit 
transcription by impeding RNAPII progression, and as anticipated we 
observed a UV-dependent chromatin-depletion of core splicing fac- 
tors. Surprisingly though, this depletion was selective; chromatin 
abundance of all detected U2 and U5 snRNP splicing factors was 

Figure 1 | DNA-damage-triggered chromatin displacement of activated 
spliceosomes. a, b, UV-induced changes in chromatin association of 
spliceosome components in quiescent HDFs. a, Immunoblots (right) and 
quantification (left) of splicing factor-chromatin association. b, Chromatin-associated 
snRNAs assayed by quantitative PCR (qPCR) and normalized to 
HotAir non-coding RNA (n = 4, mean ± s.d., t-test). c, d, Immunoblots (right) 


substantially decreased in irradiated cells while abundance of U1 and 
U4 snRNP splicing factors was not significantly affected (Extended 
Data Fig. 1d and Supplementary Table 1). Considering that spliceo- 
somes containing exclusively U2/U5/U6 snRNPs are formed at later 
stages of the splicing cycle, following eviction of U1 and U4 from the 
assembled spliceosome, we concluded that DNA damage preferentially 
targets late-maturation-stage spliceosomes, unlike chemical transcription 
inhibition that also affects early-stage spliceosome assembly. 

The proteomic results were validated by chromatin fractionation 
and immunoblotting, for U1 (U1A, U1C), U2 (SF3a1, SF3b2), U4 
(PRP3, NHP2L1) and U5 (SNRNP40, PRP8) snRNP-specific proteins 
(Fig. 1a). We also assayed by qPCR the chromatin association 
of all spliceosomal snRNAs. UV irradiation resulted in preferential 
chromatin depletion of U2, U5 and U6 snRNAs, while U1 and U4 
were essentially unaffected (Fig. 1b). Depletion of U2 and U5 snRNP 
proteins was time- (Fig. 1c) and dose-dependent (Fig. 1d), but inde- 
pendent of proliferation status and cell type (Fig. 1a, c, d). Chromatin- 
depletion of U2 and U5 snRNP splicing factors was independent of 
proteasome activity (Fig. 1d), suggesting that depletion was not 
caused by splicing factor degradation but rather by relocalization. 
In agreement, total cellular levels of all tested splicing factors were 
unaffected by DNA damage (Extended Data Fig. 1e). Splicing factor 
relocalization was verified by immunofluorescence microscopy in 
cells in which DNA damage was inflicted in a small subnuclear 
area. A representative example in Fig. 2a depicts depletion of the 
U5-associated protein SNRNP40 from DNA damage sites that were 
identified by cyclobutane pyrimidine dimer (CPD) immunodetection. 
Re-localization was monitored in real-time, using validated cell 
lines (Extended Data Fig. 2a-d) stably expressing GFP-tagged members 
of U2 (SF3a1) and U5 (SNRNP40, PRP8) snRNPs. Subnuclear 
damage infliction by UVC microbeam irradiation resulted in rapid 
depletion from irradiated sites of GFP-tagged U2 and U5 snRNP 
splicing factors but not of U1 and U4 (Fig. 2b and Extended Data 
Fig. 3a-c). Inhibition of transcription-initiation prevented this 
depletion indicating that the displaced proteins were actively involved 
in splicing (Extended Data Fig. 3d). Irradiation of the entire cell 
resulted in prominent changes in splicing factor localization as evidenced 

and quantification (left) of splicing factor-chromatin association in U2OS cells. 
c, Time post UV irradiation. d, UV dose-response and lack of influence of 
the proteasome inhibitor MG132. Graphs in c, d show signal intensities 
normalized to H2A (n = 3, mean ± s.d., t-test and one-way ANOVA). 
**P < 0.01, ***P < 0.001. 

by speckle reorganization and enlargement (Extended Data 
Fig. 4a, b). To further investigate the relocalization kinetics of GFP- 
tagged SFs, we measured their mobility by fluorescence recovery 
after photobleaching (FRAP). We observed substantial and UV- 
dose-dependent increases in the mobility of U2 and U5 snRNP factors 

Figure 2 | Mobilization and displacement of mature spliceosomes from 
sites of UVC-induced DNA damage. a, Immunofluorescence detection of 
SNRNP40 and CPDs in U2OS cells exposed to UV irradiation through porous 
membranes. b, SNRNP40-GFP depletion from UVC laser microbeam 
irradiation sites in U2OS cells; typical image (top) and fluorescence 
quantification of 20 cells (bottom). a, b, Images were obtained at 63× magnification. 
c, FRAP of UV-triggered SNRNP40-GFP mobilization in U2OS and quiescent 
HDFs (n = 25). d, FRAP of free eGFP or GFP-tagged splicing factors in 
UV-irradiated quiescent HDFs. Change in mobility was calculated as the 
fluorescence of irradiated cells minus the fluorescence of non-irradiated cells at 1 min 
post-bleaching (n = 25, mean ± s.e.m., t-test and one-way ANOVA). 
*P < 0.05, **P < 0.01, ***P < 0.001. 

but not of U1 and U4, at 1 hour post-irradiation (Fig. 2c, d). In agreement 
with the chromatin fractionation assays (Fig. 1d), mobilization 
was independent of proteasome activity, confirming that the UV-trig- 
gered mobilization is not caused by proteasome-dependent degrada- 
tion (Extended Data Fig. 5d). 
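
The change in mobility reported for the FRAP experiments is described in the Figure 2 legend as the fluorescence of irradiated cells minus that of non-irradiated cells at 1 min post-bleaching. A minimal Python sketch of that subtraction follows. It assumes that per-cell fluorescence has already been normalized to the pre-bleach intensity; the function name and the numerical values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mobility_change(irradiated, non_irradiated):
    """Difference between the mean normalized fluorescence of irradiated and
    non-irradiated cells, both measured at the same time after bleaching."""
    if not irradiated or not non_irradiated:
        raise ValueError("both groups need at least one cell")
    return mean(irradiated) - mean(non_irradiated)


# Hypothetical normalized fluorescence values at 1 min post-bleaching
uv_cells = [0.82, 0.78, 0.80]
control_cells = [0.60, 0.62, 0.58]
change = mobility_change(uv_cells, control_cells)
```

A positive value indicates faster fluorescence recovery, and therefore higher mobility, in the irradiated population.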

The UV-dependent chromatin depletion of snRNAs and proteins 
participating in late-stage spliceosomes, loss of association with 
elongating RNAPII (Extended Data Fig. 1f), rapid displacement from 
DNA damage sites and mobilization of U2 and U5 snRNP factors, 
indicate that UV irradiation influences late-stage RNAPII-associated 
spliceosomes. 


DNA-damage-specific spliceosome mobilization 


Next we used FRAP to address whether spliceosome mobilization 
is caused by specific DNA lesions or is a general response to 
macromolecular damage. Significant splicing factor mobilization 
was caused by genotoxins inflicting transcription-blocking DNA 
lesions (UV irradiation, Illudin S), but not oxidative damage 
(tert-butylhydroperoxide, rotenone, ionizing radiation), DSBs (ionizing 
radiation) or DNA inter-strand crosslinks (mitomycin C). This 
specificity argues that the observed mobilization does not result 
from non-specific RNA/DNA damage but only from DNA lesions 
that interrupt transcription (Fig. 3a and Extended Data Fig. 5a, b) 
and are substrates of the transcription-coupled nucleotide excision 
repair (TC-NER) pathway. Notably, HDFs deficient in either 
TC-NER, or global-genome (GG)-NER (lacking CSB and XPC activ- 
ities respectively), or in both (lacking XPA), show no impairments 
either in damage-triggered spliceosome mobilization (Fig. 3b) or in 
chromatin-displacement of endogenous U2 and U5 snRNP splicing 
factors (Extended Data Fig. 5c). Thus, the influence of transcription- 
blocking lesions in splicing factor localization is independent from 
NER complex assembly indicating that pausing of elongating 
RNAPII is necessary and sufficient to trigger chromatin displacement 
of late-stage spliceosomes. 


Spliceosome mobilization by DDR signals 

Transcription inhibition by chemicals that target RNAPII mobilizes 
splicing factors of all snRNPs, unlike UV irradiation that preferentially 
targets those participating in late-stage complexes (Fig. 4b 

Figure 3 | Chromatin displacement of mature spliceosomes is caused by 
RNAPII-blocking lesions and is NER-independent. a, FRAP of SNRNP40-GFP 
in quiescent HDFs exposed to genotoxins (n = 30, mean ± s.e.m., one-way 
ANOVA). IR, ionizing radiation; tBH, tert-butylhydroperoxide; MM-C, 
mitomycin-C. b, UV-triggered mobilization of SNRNP40-GFP in HDFs 
deficient in GG-NER (XPC), TC-NER (CSB) or both (XPA) 
(n = 30, mean ± s.e.m., t-test). ***P < 0.001. 

and Extended Data Fig. 5e). This preferential mobilization implies 
distinct mechanisms of action between UV-irradiation-dependent 
and chemically-induced transcription inhibition. However, to form- 
ally exclude the possibility that transcription-blocking DNA lesions 
mobilize spliceosomes exclusively through RNAPII arrest, we used 
5,6-dichloro-1-β-D-ribofuranosyl-benzimidazole (DRB) to inhibit 
transcription to the same extent as UV irradiation. Transcription 
arrest was evaluated by measuring reduced 5-ethynyl-uridine (5EU) 
incorporation into newly synthesized RNA (Fig. 4a). Both treatments 
increased spliceosome mobility (Fig. 4b and Extended Data Fig. 6a) 
and their combination had an additive effect (Extended Data Fig. 6b). 
Notably, UV irradiation had a more profound splicing-factor- 
mobilizing effect than DRB (at equal transcription-inhibiting doses), 
indicating that transcription inhibition alone is not sufficient to attain 
the extensive mobilization triggered by UV irradiation (Fig. 4b and 
Extended Data Fig. 6a). 

Pausing of RNAPII at DNA lesions not only halts transcription, but 
also activates DDR signalling pathways that modulate the cellular response 
via post-translational modifications. Considering that many 
core splicing factors have been identified as DDR-kinase substrates, 
we used the broad-range DDR-kinase inhibitor caffeine to evaluate if 
DDR signalling influences spliceosome organization. Caffeine partially 
suppressed the UV-dependent spliceosome mobilization but had no 
influence on the DRB-dependent mobilization, confirming that the 
two processes are, in part, mechanistically distinct (Fig. 4d). 

To dissect which DDR signalling system augments the UV- 
triggered spliceosome mobilization, cells were treated with specific 
inhibitors of the major caffeine-sensitive DDR kinases: ATM, ATR 
and DNA-dependent protein kinase (DNA-PK). Neither ATR nor 
DNA-PK inhibition had a significant effect (Fig. 4c and Extended 
Data Fig. 6d). Surprisingly, ATM inhibition in non-replicating cells 
suppressed splicing factor mobilization to levels similar to caffeine 
(Fig. 4c and Extended Data Fig. 6d), while it had no influence on DRB- 
mediated mobilization (Extended Data Fig. 6h). The dependency of 
UV-triggered spliceosome mobilization on ATM signalling was con- 
firmed by the impaired splicing factor mobilization in HDFs derived 
from an ataxia telangiectasia patient compared to those of a healthy 
donor (Fig. 4e and Extended Data Fig. 6c). Thus DNA-damage- 
triggered spliceosome mobilization results from the combined con- 
tribution of transcription inhibition and ATM signalling. 

To evaluate the impact of ATM-dependent spliceosome mobiliza- 
tion on pre-mRNA processing, we assayed splicing efficiency in a 
select panel of DDR- and cell-cycle-related genes. Quiescent RPE 
cells were UV irradiated in the absence or presence of the ATM 
inhibitor and intron retention was assayed by reverse-transcription 
PCR (RT-PCR). UV irradiation resulted in increased ATM-dependent 
intron retention (Fig. 4f and Extended Data Fig. 6f), while 
transcription inhibition by DRB had minor, and ATM-independent, 
effects. Specificity of the ATM inhibitor was confirmed by small- 
interfering RNA (siRNA)-mediated ATM silencing which gave ident- 
ical results (Extended Data Fig. 6e). 
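
The Figure 4 legend states that RT-PCR signal intensity is expressed as an unspliced:spliced ratio. The following Python sketch illustrates that readout under stated assumptions: band intensities are background-corrected numbers in arbitrary units, and the function names and values are hypothetical and do not reproduce the quantification pipeline used in the study.

```python
def intron_retention_ratio(unspliced_signal, spliced_signal):
    """Unspliced:spliced ratio from two band intensities (arbitrary units)."""
    if spliced_signal <= 0:
        raise ValueError("spliced signal must be positive")
    return unspliced_signal / spliced_signal


def fold_over_control(treated_ratio, control_ratio):
    """Intron retention of a treated sample relative to its untreated control."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return treated_ratio / control_ratio


# Hypothetical band intensities for one gene
control = intron_retention_ratio(unspliced_signal=10.0, spliced_signal=100.0)
uv = intron_retention_ratio(unspliced_signal=30.0, spliced_signal=60.0)
uv_fold = fold_over_control(uv, control)
```

Under this convention, a ratio that rises after UV irradiation and falls again when ATM is inhibited would correspond to ATM-dependent intron retention.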

To investigate the genome-wide influence of UV irradiation on alter- 
native splicing, as well as the ATM contribution in UV-irradiation- 
dependent gene expression and mRNA processing changes, we 
performed RNA sequencing (RNA-seq) on cells that were untreated 
or UV-irradiated in the presence or absence of the ATM inhibitor. We 
observed that a substantial number of UV-induced gene expression 
changes depend on ATM activity (Extended Data Fig. 6g), revealing 
a previously unknown contribution of ATM signalling in the UV- 
regulated transcriptome. Importantly, UV irradiation resulted in 
widespread splicing changes, a subset of which (up to 40%) was partly 
ATM-dependent, demonstrating the genome-wide influence of ATM 
not only in mRNA abundance but also in UV-induced alternative 
splicing (Fig. 4g and Supplementary Table 2). 

Collectively, these findings demonstrate that UV irradiation influences 
gene expression in an ATM-dependent manner, and that ATM 

Figure 4 | ATM modulates spliceosome mobilization and influences 
splicing decisions upon DNA damage. a, RNA synthesis measured by 5EU 
pulse labelling (n = 150, mean ± s.e.m., t-test). Top, representative images 
obtained at 20× magnification. Bottom, graph of quantification (n = 150, 
mean ± s.e.m., t-test). b-e, FRAP of splicing factors in quiescent HDFs (n = 25, 
mean ± s.e.m., one-way ANOVA). b, Response to UV or DRB treatment. 
c-e, SNRNP40 response to: c, UV irradiation with or without ATM, ATR or 
DNA-PK inhibitors (ATMi, ATRi and DNA-PKi, respectively); d, UV or DRB 


participates in the selection of the genetic information contained in 
mature transcripts, thus revealing a novel non-canonical function of 
ATM in DDR. 


Spliceosome-ATM reciprocal regulation 

The ATM-dependency of splicing-factor-mobilization in quiescent 
cells indicates that UV irradiation activates ATM via a mechanism 
distinct from its canonical activation by replicative-stress- and 
ionizing-radiation-inflicted DSBs. UV irradiation of quiescent HDFs 
activated ATM, as evidenced by its auto-phosphorylation and 
phosphorylation of CHK2 (Fig. 5a and Extended Data Fig. 7a-e) to levels 
similar to the topoisomerase I inhibitor camptothecin (CPT) and the 
deacetylase inhibitor and non-canonical ATM activator, trichostatin 
A (TSA) (Extended Data Fig. 7a). Notably, in UV-irradiated cells 
active ATM was dispersed throughout the nucleus, which contrasts 
with the focal accumulation triggered by DSB-inducing agents such as 
CPT or ionizing radiation (Extended Data Fig. 7e). Furthermore, in 
cells where ATR was also inhibited, UV-dependent γH2AX and 
53BP1 foci were rare (Extended Data Fig. 7f), suggesting that in 
non-proliferating cells UV-dependent ATM activation occurs in the 
absence of DSBs. 

Impairments in co-transcriptional splicing promote hybridization of 
nascent RNA and single-stranded template DNA at the transcription 
bubble, resulting in three-nucleic-acid-strand structures known as 
R-loops. R-loops have been reported to cause genomic instability after 
splicing factor depletion and activate ATM in both proliferating 
and post-mitotic cells. In agreement, siRNA-mediated silencing of 

treatment with or without caffeine; e, UV irradiation of HDFs from an 
ataxia telangiectasia (AT) patient or a healthy donor (Ctrl). f, DRB- or 
UV-triggered and ATM-dependent intron inclusion assayed by RT-PCR in 
quiescent cells. Signal intensity expressed as unspliced:spliced ratio (n = 4, 
mean ± s.d., one-way ANOVA). g, Genome-wide identification by RNA-seq 
of UV-induced alternative splicing (AS) events. Right, types of alternative 
splicing events; left, number of total and ATM-dependent events. *P < 0.05, 
**P < 0.01, ***P < 0.001. 


U2 or U5 snRNP splicing factors, or combined RNase H1/H2A silen- 
cing, resulted in ATM activation in the absence of other treatments 
(Extended Data Fig. 8a, b, g). Similarly, treatment of quiescent cells with 
pladienolide B, which arrests late-stage spliceosomes and mobilizes 
U5, and to a lesser extent U2 snRNP splicing factors (Extended Data 
Fig. 8c), resulted in robust ATM activation (Extended Data Fig. 8d, e) 
and intron-retention levels comparable to UV irradiation (Extended 
Data Fig. 8f). To explain our observations we formulated the following 
hypothesis: RNAPII arrest at DNA lesions displaces a subset of splicing 
factors engaged in co-transcriptional splicing. Spliceosome displacement, 
in combination with negative supercoiling behind RNAPII, facilitates 
hybridization of naked pre-mRNA (still containing intronic 
sequences) to the DNA template strand. The resulting R-loop activates 
ATM, which then amplifies the mobilization signal and stimulates fur- 
ther spliceosome displacement either by promoting disassembly or pre- 
venting assembly of late-stage spliceosomes. Accordingly, we predicted 
that: (1) R-loops are formed at sites of UV-induced DNA damage; and 
(2) manipulation of R-loop levels will alter spliceosome mobility. 

To visualize and resolve R-loops in UV-irradiated cells we exploited 
the ability of RNaseH to bind and hydrolyse RNA at RNA-DNA 
duplexes. For indirect, real-time visualization of R-loops, we used 
HDFs stably expressing GFP-tagged RNaseH1(D145N), a binding-competent 
but catalytically inactive RNaseH1. RNaseH1(D145N) 
was rapidly recruited to UVC microbeam-irradiated sites in a tran- 
scription-dependent but ATM-independent manner (Fig. 5b and 
Extended Data Fig. 9d), suggesting R-loop formation at DNA-damage 
sites. The ability of RNaseH1(D145N) to detect R-loops was confirmed 

Figure 5 | Reciprocal regulation between spliceosome mobilization and 
R-loop-dependent ATM signalling. a, Immunofluorescence of ATM 
activation in quiescent HDFs. pCHK2 and pATM, phosphorylated CHK2 and 
ATM, respectively. b, Recruitment of RNaseH1(D145N)-GFP and mCherry-XPA 
at UVC microbeam irradiation sites (n = 10, mean ± s.e.m., t-test). 
c-f, FRAP showing SNRNP40-GFP mobilization in: c, non-transfected and 
mCherry-RNaseH1-expressing U2OS cells; d, after RNaseH1/H2A silencing; 
e, in quiescent HDFs treated with DRB and/or ionizing radiation (IR); f, after 
UV or CPT treatment. (c-f, n = 30, mean ± s.e.m., one-way ANOVA). 
g, h, Intron retention assayed by RT-PCR in quiescent cells after combined IR/DRB 
treatment (g) and silencing of RNaseH1/H2A (h). (g, h, n = 2, 
mean ± s.d., one-way ANOVA.) i, Model of UV-triggered and R-loop/ATM-augmented 
spliceosome mobilization. a, b, c, Images were obtained at 40× 
(a) and 63× (b, c) magnification. *P < 0.05, **P < 0.01, ***P < 0.001. 


by overexpression of active RNaseH1 or by silencing of RNaseH2, 
which prevented or potentiated, respectively, recruitment of 
RNaseH1(D145N) at UVC microbeam-irradiation sites (Extended 
Data Fig. 9a). Formation of R-loops at these sites was verified using 
the S9.6 DNA-RNA-hybrid-specific antibody (Extended Data Fig. 
9b). Silencing of RNaseH augmented R-loop abundance and resulted in 
detectable immunofluorescence signals at nuclear areas irradiated with 
doses that normally do not elicit a detectable signal, thereby confirm- 
ing the S9.6 antibody specificity (Extended Data Fig. 9c). 

Overexpression of active RNaseH1 attenuated the UV-induced 
spliceosome mobilization to levels identical to ATM inhibition 
(Fig. 5c and Extended Data Fig. 9e). No additional effect was observed 
when the two manipulations were combined (Fig. 5c), suggesting that 
RNaseH1 mitigates the UV-triggered spliceosome mobilization by 
preventing ATM activation. Conversely, silencing of RNaseH1 and 
H2A, which resolve the majority of RNA-DNA duplexes within the 
cell, results in ATM activation (Extended Data Fig. 8a, b) and 
augments the UV-triggered R-loop formation (Extended Data 
Fig. 9c), spliceosome mobilization (Fig. 5d and Extended Data 
Fig. 9f, g) and intron retention (Fig. 5h). 

ATM is required for a substantial fraction of the UV-triggered 
spliceosome mobilization. Regardless, ATM activation alone (for 
example, as a result of ionizing radiation) does not influence spliceosome 
mobility (Figs 3a, 5e and Extended Data Fig. 5b), indicating that 
ATM controls a positive feedback mechanism that enhances, but 
cannot trigger, spliceosome displacement (Fig. 5i). We hypothesized 
that UV-dependent transcription inhibition acts as the initiating 
mechanism for spliceosome mobilization, which is then enhanced 
by a secondary ATM-dependent signal. To test this, we used treat- 
ments (DRB and ionizing radiation) that each can specifically influ- 
ence one process; DRB inhibits transcription (Fig. 4a) but does not 
activate ATM (Extended Data Fig. 7a), while ionizing radiation acti- 
vates ATM (Extended Data Figs 7a, e and 8d, e) but does not interfere 
with global transcription (Extended Data Fig. 5a). Combination of 
DRB and ionizing radiation had additive effects in both spliceosome 
mobilization and intron retention (Fig. 5e, g and Extended Data 
Fig. 10a, b), indicating that ATM amplifies (but does not initiate) a 
mobilization signal imposed by transcriptional arrest. In agreement, 
treatment of quiescent HDFs with CPT, which promotes formation 
of transcription-blocking lesions (Extended Data Fig. 10d) and 
R-loop-dependent ATM activation (Extended Data Fig. 7e), can also 
efficiently mobilize spliceosomes to levels higher than expected by 
transcription inhibition alone (Fig. 5f and Extended Data Fig. 10c). 


Discussion 


Here we present evidence that the core spliceosome is a target and an 
effector of the cellular response to transcription-blocking DNA 
damage, and we define a previously uncharacterized ATM-dependent 
branch of genome surveillance. Transcription-blocking DNA lesions 
cause selective chromatin displacement of late-stage spliceosomes by 
a two-step mechanism involving a stochastic (cis) and an ATM- 
signalling-mediated (trans) stage. Our hypothesis is that displace- 
ment of assembled co-transcriptional spliceosomes is required to 
remove steric inhibition that would otherwise prevent back-tracking 
(or removal) of RNAPII from DNA lesions, which is critical for subsequent 
DNA repair. The initial spliceosome displacement probably 
results in naked (intron-retaining) pre-mRNA readily available for 
hybridization with template single-strand DNA at the transcription 
bubble. This culminates in R-loop formation at damaged DNA sites, 
which in turn activates ATM. Previously, R-loop-mediated ATM 
activation has been linked to replication-induced DSBs because of 
collision of arrested transcription complexes with the replication 
machinery. Here, we demonstrate that neither DSBs nor replication 
is required for R-loop-dependent ATM activation. While the 
exact mode of UV-triggered ATM activation remains to be deter- 
mined, it does have significant biological consequences. It influences 
gene expression and plays a fundamental role in augmenting spliceo- 
some displacement and alternative pre-mRNA splicing genome-wide. 

ATM activation and spliceosome displacement are subject to recip- 
rocal regulation, which has two unanticipated implications. First, in 
response to transcription-blocking lesions, changes in spliceosome 
organization activate ATM signalling irrespective of replication. 
Second, ATM modulates DDR, not only by controlling expression 
levels of its target genes, but also by influencing pre-mRNA proces- 
sing. These observations provide new insights into the mechanisms 
and consequences of ATM activation in post-mitotic tissues, which is 
critical for proper cellular function, as evidenced by the severe neurodegeneration 
in ataxia telangiectasia patients. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS 


Materials. Micrococcal nuclease (MNase) and all chemicals were purchased from 
Sigma-Aldrich unless otherwise specified. DNA-modifying enzymes were from 
Roche Applied Sciences. Pladienolide B was from Santa Cruz Biotechnology, the 
ATR inhibitor VE821 from TINIB-Tools, and the ATM inhibitor KU55933 and 
DNA-PK inhibitor NU7441 from R&D Systems. Antibodies used were against: 
PRP8 (H300), XPA/p62 (FL-273), p89/XPB (S-19) and β-tubulin from Santa 
Cruz Biotechnology; SNRPC/U1C (NBP1-96048), NHP2L1 (NBP1-32732), 
SF3a1 (NB100-79847), SF3b2 (NB100-79843), RNaseH1 (NBP2-20171), and 
RNaseH2A (NBP1-76981) from Novus Biologicals; SNRNP40 (SAB2701506) 
and SRSF2/SC35 (clone SC-35) from Sigma; SNRPA/U1A (3F9-1F7) from 
ABGENT; PRPF3 (ab187535), RNAPII CTD (phospho-S2) (ab5095), RNAPII 
(ab5095), PCNA (PC-10), Ki67 (ab833) from abcam; CPD (TDM-2) from 
MBL International; GFP (11 814 460 001) from Roche; H2A (07-146) from 
Millipore Corp.; phospho-ATM(1981)(05-740) from Upstate Biotechnology, 
phospho-CHK2(Thr68) (2661) from Cell Signaling. Anti-XPC (rabbit-polyclonal 
ab) was in-house developed. Odyssey-compatible IRDye680- and IRDye800- 
conjugated secondary antibodies were from LI-COR. Secondary antibodies 
conjugated to Alexa Fluorochromes-488, -568, -594 and -647 were from 
Invitrogen. GFP-tagged proteins were immunoprecipitated with GFP-Trap beads 
(ChromoTek). 

Cell culture, SILAC labelling and cell treatments. Cell lines used in this study 
were: ataxia telangiectasia patient (AT2)- and healthy adult donor (C5Ro)- 
derived human dermal fibroblasts (HDFs); SV40-transformed XPA (XP12RO), 
XPC (XP4A) and CSB (CS1AN) patient-derived HDFs; hTERT immortalized 
HDFs (C5Ro-T), VH10 human foreskin fibroblasts (VH10-T) and human retinal 
pigmented epithelial cells (RPE1, ATCC), human osteosarcoma cells (U2OS, 
ATCC); and the amphotropic retroviral packaging cell line Gryphon A (Allele 
Biotechnology). Cells were subcultivated under standard culture conditions 
(37 °C, 5% CO₂) in a humidified incubator. U2OS, Gryphon A, and SV40-transformed 
cells were grown in Dulbecco’s Modified Eagle’s Medium (DMEM, 
Lonza), supplemented with 10% v/v fetal bovine serum (FBS, Fisher Scientific) 
and 1% v/v penicillin-streptomycin (PS, Lonza). Primary and TERT-immorta- 
lized HDFs and RPE-1 cells were cultured in Ham’s F10 (Lonza) supplemented 
with 15% FBS and 1% PS. When applicable, cells were synchronized in quiescence 
by 72 h serum-deprivation. For FRAP and immunofluorescence experiments, 
cells were seeded on 25-mm-diameter glass slides. For UVC laser/live-cell- 
imaging experiments, cells were seeded on quartz coverslips (010191T-AB; SPI 
Supplies). For stable isotopic labelling with amino-acids in culture (SILAC), 
C5Ro-T cells were cultured for >5 population doublings (PD) in lysine-, arginine- 
and leucine-free DMEM (AthenaES) supplemented with antibiotics, non-essential 
amino-acids (Lonza), 10% dialysed FBS (Invitrogen) and 105 µg ml⁻¹ leucine 
and either 73 µg ml⁻¹ light [¹²C₆]lysine and 42 µg ml⁻¹ [¹²C₆,¹⁴N₄]arginine or 
with heavy [¹³C₆]lysine and [¹³C₆,¹⁵N₄]arginine (Cambridge Isotope 
Laboratories). In each subcultivation, cell numbers were determined using a 
Beckman Z2 coulter counter (Beckman Coulter, Inc.), and 0.5 × 10⁴ cells were 
seeded per cm² of growth surface area. The increase in population doubling 
(ΔPD) was calculated using the formula ΔPD = log₁₀(number of cells harvested/number 
of cells seeded)/log₁₀(2). 
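
As a worked illustration of the population-doubling formula above, the short Python sketch below computes the increase in population doublings from the numbers of cells harvested and seeded. The function name and the example cell counts are hypothetical.

```python
import math


def population_doubling_increase(cells_harvested, cells_seeded):
    """Increase in population doublings:
    log10(cells harvested / cells seeded) / log10(2)."""
    if cells_harvested <= 0 or cells_seeded <= 0:
        raise ValueError("cell numbers must be positive")
    return math.log10(cells_harvested / cells_seeded) / math.log10(2)


# Hypothetical passage: 1 x 10^5 cells seeded, 4 x 10^5 cells harvested
delta_pd = population_doubling_increase(4e5, 1e5)
```

A fourfold increase in cell number corresponds to two population doublings, so cumulative doublings over successive passages can be obtained by summing the per-passage values.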

Cells were UVC irradiated (254 nm, TUV Lamp, Philips) at the indicated 
doses. For local DNA damage infliction, cells were UV-irradiated (60 J m⁻²) 
through isopore polycarbonate membranes containing 5-µm-diameter pores 
(Millipore). Chemicals were added directly in the growth media at the indicated 
concentrations. In FRAP experiments cells were assayed 1 h after initiation of 
treatment with the exception of illudin S and rotenone, which were assayed at 6 h. 
Pre-incubation with caffeine (10 mM), DDR-kinase inhibitors (10 µM) and 
MG132 (50 µM) started 1 h before genotoxic treatments and lasted throughout 
the experiment. α-Amanitin treatments were for >24 h. For exon-specific RT-PCR, 
cells were lysed 6 h after treatment. 
Mass spectrometry and data analysis. Nanoflow liquid chromatography-tandem 
mass spectrometry (LC-MS/MS) and data analysis were as described. 
In brief, samples containing MNase-digested chromatin were size-fractionated 
by SDS-PAGE, gels were cut in 2-mm slices, and subjected to dithiothreitol 
reduction, iodoacetamide alkylation and trypsin digestion. LC-MS/MS was performed 
on an 1100 series capillary liquid chromatography system (Agilent 
Technologies) coupled to an LTQ-Orbitrap XL mass spectrometer (Thermo 
Scientific) operating in positive mode. Raw mass spectrometry data were analysed 
using the MaxQuant software. A false discovery rate of 0.01 for proteins and 
peptides and a minimum peptide length of six amino acids were set. The 
Andromeda search engine was used to search MS/MS spectra against the 
International Protein Index (IPI) human database. Statistical analysis was 
performed with Perseus (1.5.0.30). 

Cloning. Human full-length cDNA clones used for subcloning were: PRP8/ 
PRPF8/DHX16 (CS116070), SF3A1 (SC321295), SNRNP40 (SC112670) and 
RNaseH1 (SC319446) from Origene and U1A/SNRPA (MHS6278-202826119), 
NHP2L1 (MHS6278-202839330) and PRP3/PRPF3 (MHS6278-202826220) 
from Dharmacon. To generate vectors expressing GFP- and mCherry-tagged 
proteins the open reading frames (minus the stop codon) of human U1A, 
SF3a1, PRP3, NHP2L1, PRP8 and SNRNP40 were PCR amplified using oligonucleotides 
containing restriction enzyme sites. PCR products were subcloned into a 
pLHCX retroviral expression vector (Clontech Laboratories) modified to contain 
eGFP lacking the initiation codon. XPA and RNaseH1 lacking the mitochondrial 
localization signal (amino acids 1-28) were subcloned in modified pLHCX vec- 
tors containing either eGFP or mCherry lacking their stop codons. PCR ampli- 
fications were performed on a MJ Scientific, Inc., PTC-100 Thermocycler using 
high-fidelity Phusion polymerase (Bioke). Amplified cDNAs were purified using 
the Promega Wizard kit. Following restriction digestion of inserts and vectors, 
shrimp alkaline phosphatase treatment of the vectors, and agarose gel electro- 
phoresis, the gel-excised DNAs were purified using the Promega Wizard kit. 
DNA inserts were ligated into vectors at a 3:1 molar ratio. Plasmid DNAs were 
validated by restriction digestion and sequencing. 
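
The 3:1 insert-to-vector molar ratio mentioned above can be turned into a pipetting mass with simple arithmetic. The following is a minimal sketch and not the authors' protocol: the function name, the 100 ng vector input and the fragment lengths are hypothetical, and it assumes double-stranded DNA, so that the mass of a given number of moles scales linearly with fragment length.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 3.0) -> float:
    """Return the insert mass (ng) giving `molar_ratio` insert:vector.

    For double-stranded DNA the mass per mole is proportional to length,
    so the same number of moles of a shorter fragment weighs less.
    """
    if vector_ng <= 0 or vector_bp <= 0 or insert_bp <= 0 or molar_ratio <= 0:
        raise ValueError("all inputs must be positive")
    # Mass of insert that is equimolar to the vector, scaled by the ratio.
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 6,000-bp vector, 1,200-bp insert.
    print(insert_mass_ng(vector_ng=100, vector_bp=6000, insert_bp=1200))
```

With these illustrative numbers, an equimolar amount of insert would be 20 ng, so a 3:1 ratio calls for 60 ng.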

Infections/transfections. C5Ro-T, C5Ro, AT-2, U2OS and VH-10T cell lines
stably expressing GFP-tagged proteins were generated by retroviral infection
followed by hygromycin selection. For retrovirus production, Gryphon A cells
were transfected with the appropriate expression vector using FuGENE 6 (Roche)
according to the manufacturer’s instructions. Viral supernatants were harvested
48 h post-transfection, filtered through 0.45-µm filters (Millipore Corp.) and used
immediately to infect subconfluent cell cultures in the presence of 5 µg ml⁻¹
polybrene. U2OS cells were transiently transfected with RNaseH1-mCherry
(pLHCX) using FuGENE 6. For gene silencing the following siRNAs were
purchased from Thermo Scientific as SMARTpools: ON-TARGETplus
Human RNaseH2A siRNA (L-003535-01-0005) targeting the catalytic subunit
A of RNaseH2, ON-TARGETplus RNaseH1 siRNA (L-012595-01-0005),
ON-TARGETplus PRP8 siRNA (L-012252-01-0005), ON-TARGETplus SF3a1
siRNA (L-016051-01-0005), ON-TARGETplus ATM siRNA (L-003201-00-00005)
and a control/scrambled siRNA duplex (D-001210-05-05). For gene
silencing, U2OS cells were transfected with RNAiMAX (Invitrogen), and C5Ro-T and
RPE cells with HiPerfect (Qiagen), as recommended by the manufacturers. To
inhibit endogenous RNaseH activity, cells were transfected with a 1:1 mixture of
siRNAs targeting RNaseH1 and RNaseH2A.

Preparation of whole-cell lysates, chromatin fractionation, and
immunoprecipitations. Whole-cell lysates were prepared by lysis of equal cell numbers in
60 mM Tris-Cl (pH 6.8), 2% SDS, 10% glycerol, 5% β-mercaptoethanol and
0.01% bromophenol blue. Crude chromatin was isolated after Triton-X 100
extraction and MNase digestion. All fractionation steps were performed at
4 °C. Cell pellets were suspended in a non-denaturing isosmotic buffer (10 mM
PIPES (pH 7.0), 3 mM MgCl₂, 100 mM NaCl, 300 mM sucrose, 0.5 mM Na₃VO₄,
5 mM NaF, 5 mM Na₄P₂O₇, 10 mM β-glycerolphosphate, 0.1 mM PMSF, 1 mM
EGTA, 1× EDTA-free protease inhibitor cocktail (Roche), 15 µM MG132,
10 mM N-ethylmaleimide and 20 µM PR-619 (LifeSensors)) and extracted in
the same buffer with 0.5% (v/v) Triton-X 100 for 5 min. Following centrifugation
(650g, 5 min), nuclei depleted from soluble nucleoplasm were washed with
MNase digestion buffer (50 mM Tris-Cl (pH 7.5), 4 mM MgCl₂, 50 mM KCl,
300 mM sucrose, 0.5 mM Na₃VO₄, 5 mM NaF, 5 mM Na₄P₂O₇, 10 mM
β-glycerolphosphate, 1 mM PMSF, 1 mM EGTA and 1× EDTA-free protease
inhibitor cocktail) and subsequently incubated with 0.3 U MNase (Sigma)/1 X 10° 
nuclei, and 1 mM CaCl₂ (37 °C, 10 min). Addition of (NH₄)₂SO₄ to a final
concentration of 250 mM was used to facilitate extraction of stably DNA-bound 
proteins. EGTA and EDTA were added to 5 mM and samples were centrifuged at 
16,000g for 20 min. Protein concentrations were determined using a modified 
Bradford method (Bio-Rad). For GFP immunoprecipitations, cells were lysed in 
20 mM Tris-Cl (pH 7.5), 5 mM MgCl₂, 150 mM NaCl, 0.5% Triton X-100, 1×
phosphatase inhibitor (Roche) and 1× protease inhibitor cocktail. Chromatin
was mechanically sheared by passing through a 27G syringe, 40 times. Particulate 
matter was removed by centrifugation (20 min at 16,000g) and supernatants 
containing equal amounts of proteins were used for immunoprecipitation. 
GFP-tagged proteins were immunoprecipitated directly or after MNase digestion 
which was used to cleave DNA and RNA and disrupt ternary complexes. Samples 
were incubated (2 h, 4°C) with pre-equilibrated GFP-Trap coupled to agarose 
beads (ChromoTek), and after extensive washing (10 mM Tris-Cl (pH 7.5),
150 mM NaCl, 0.5 mM EDTA, 0.5% NP-40), immunocomplexes were dissociated
from the beads by heating for 10 min at 95 °C, in 120 mM Tris-Cl (pH 6.8), 4%
SDS, 20% glycerol, 10% β-mercaptoethanol, 0.01% bromophenol blue. For
immunoprecipitation of elongating RNAPII, cells were treated and extracted as 
for isolation of crude chromatin with the exception that instead of MNase 
digestion, chromatin was mechanically sheared. Immunoprecipitations were
performed by overnight incubation with either the anti-RNAPII CTD phospho-Ser2
antibody or rabbit IgG, followed by incubation with protein A/protein G agarose 
beads (Upstate Biotechnology). 

Immunoblotting. Protein samples were size-fractionated on 5-20% gradient 
SDS-polyacrylamide gels (BioRad) and electro-transferred onto nitrocellulose 
membranes using a Bio-Rad Mini-Protean electrophoresis system. Abundance 
of proteins of interest was assayed using antibodies at concentrations recom- 
mended by their manufacturers. Membranes were incubated with primary anti- 
bodies in Tween 20/Tris-buffered saline (20 mM Tris (pH 7.4), 150 mM NaCl, 
0.1% Tween 20) containing 3% w/v non-fat dry milk or, when the pATM 
antibody was used, 3% BSA. Following binding of the appropriate anti-mouse 
or anti-rabbit Alexa Fluorochrome-conjugated secondary antibody and extensive 
washing, proteins of interest were visualized using the Odyssey CLx Infrared 
Imaging System (LI-COR Biosciences). Signal intensities were quantified using 
the ImageQuant TL software (GE Healthcare Life Sciences). 

RNA synthesis. Transcription levels were determined following 2-hour incuba- 
tion with ethynyluridine (EU) added directly in the culture (serum-free) media. 
EU incorporation was visualized using Click-iT conjugation of AlexaFluor647 
(Invitrogen) according to the manufacturer’s protocol. Images were obtained 
using a Zeiss Axio Imager Z2 upright laser-scanning confocal microscope 
equipped with a 63× Plan-Apochromat 1.4 NA oil-immersion lens (Carl Zeiss
Inc.). Fluorescence-signal intensities were quantified using the ImageJ software
(NIH). In each experiment >150 cells per condition were analysed.
Immunofluorescence and live-cell confocal laser-scanning microscopy. For
immunofluorescence experiments, cells were fixed with 3.7% paraformaldehyde
(PFA)/PBS and permeabilized in 0.5% Triton-X 100/PBS. For detection of
splicing factors and XPC, cells were pre-extracted with 0.5% Triton-X 100/PBS before
fixation (0.5 min). For CPD immunodetection, nuclear DNA was denatured with
0.07 M NaOH for 5 min. For SRSF2/SC35 immunodetection, cells fixed in 2%
PFA/0.2% Triton-X 100/PBS were treated with 100% acetone (5 min, −20 °C).
Non-specific antigens were blocked in 3% BSA/PBS. R-loop immunodetection
with the S9.6 antibody was as described previously. In brief, PFA-fixed cells were
permeabilized by Triton X-100, followed by extraction with 0.5% SDS. Cells were
blocked with 3% BSA, 0.1% Tween 20 in 4× saline sodium citrate (SSC) buffer.
Hybridization of primary antibodies was overnight at 4 °C, and with secondary
Alexa Fluorochrome-conjugated antibodies for 1 h at room temperature.
Coverslips were mounted on glass slides using 4′,6-diamidino-2-phenylindole
(DAPI)-containing ProLong Gold antifade reagent (Molecular Probes) and
imaged on a Zeiss Axio Imager Z2 upright laser-scanning confocal microscope.

Live-cell-imaging experiments were performed with a Leica TCS SP5 AOBS 
laser scanning confocal microscope equipped with an environmental chamber 
(37 °C, 5% CO₂). Kinetic studies of GFP-tagged proteins were performed using
UVC (266 nm)-laser-irradiation for local DNA damage infliction. In brief, a
2 mW pulsed (7.8 kHz) diode-pumped solid-state laser emitting at 266 nm (Rapp 
OptoElectronic) was connected to the confocal microscope with an Axiovert 
200M housing adapted for UV by all-quartz optics. By focusing the UVC laser 
inside cell nuclei without scanning, only a limited area within the nucleus
(diffraction-limited spot) was irradiated. Cells were imaged and irradiated through a
100×, 1.2 NA Ultrafluar quartz objective lens. Images obtained before and after
UVC laser irradiation were analysed using the LASAF software (Leica). 
Fluorescence intensity in the irradiated area or a non-irradiated area in the 
nucleus was normalized to levels in the same area before irradiation. Data were 
expressed as the percentage change in relative fluorescence intensity. In each 
experiment at least ten cells were analysed and all experiments were performed 
a minimum of three times. 

Mobility of GFP-tagged proteins was measured by strip-FRAP as described.
In brief, a narrow (~1 µm) strip spanning the width of the nucleus was
photobleached at ~20% of the initial GFP-signal intensity using a 488 nm laser at 100%
power. Recovery of fluorescence in the strip was monitored at 25-ms intervals. 
Images obtained were analysed using the LASAF software (Leica). FRAP data 
were normalized to the fluorescence levels before photobleaching after subtrac- 
tion of the background signal. In each experiment 8-10 cells per condition were 
analysed and all experiments were performed at least three times. A negative 
(untreated) and positive (20 J m⁻² UV) control were included in all experiments.
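
The FRAP normalization described above (background subtraction followed by scaling to the pre-bleach level) can be sketched as follows. This is an illustrative reading of the text, not the LASAF implementation: the function name and the example intensities are hypothetical, and a single scalar background value per cell is assumed.

```python
from typing import List, Sequence


def normalize_frap(intensities: Sequence[float], prebleach: float,
                   background: float) -> List[float]:
    """Normalize strip intensities to the pre-bleach level.

    Each value is background-subtracted and divided by the
    background-subtracted pre-bleach intensity, so 1.0 means full
    recovery to the level measured before photobleaching.
    """
    reference = prebleach - background
    if reference <= 0:
        raise ValueError("pre-bleach intensity must exceed background")
    return [(value - background) / reference for value in intensities]


if __name__ == "__main__":
    # Hypothetical raw intensities sampled after the bleach pulse.
    curve = normalize_frap([30.0, 65.0, 100.0], prebleach=110.0, background=10.0)
    print([round(v, 2) for v in curve])
```

A curve that plateaus below 1.0 after normalization indicates an immobile fraction in the bleached strip, which is the quantity that a comparison between treated and untreated cells relies on.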
Chromatin-associated RNA isolation and snRNA qPCR. Chromatin-associated
RNA was isolated by a modification of the method developed by
Wuarin and Schibler for the isolation of ternary-complex-associated nascent
RNA. Briefly, cell pellets were re-suspended in 20 mM HEPES (pH 7.5), 10 mM
KCl, 250 mM sucrose, 5 mM MgCl₂, 1 mM EGTA, 1 mM PMSF, 1 µl ml⁻¹
RNasin (Invitrogen), 1× phosphatase inhibitor (PhosSTOP, Roche) and 1×
protease inhibitor cocktail (Roche), and lysed by the addition of digitonin
to 200 µg ml⁻¹ final concentration (10 min, 4 °C). Nuclei were pelleted by
centrifugation (650g, 5 min) and following re-suspension in a buffer containing
20 mM Tris-HCl (pH 7.5), 75 mM NaCl, 0.5 mM EGTA, 50% glycerol, 1 mM
PMSF, 1 µl ml⁻¹ RNasin, 1× protease and 1× phosphatase inhibitors, were
extracted for 10 min at 4 °C by the addition of ten volumes of a solution
containing 20 mM HEPES (pH 7.6), 7.5 mM MgCl₂, 0.2 mM EGTA, 300 mM NaCl, 1 M
urea and 1% NP-40. Pelleted nuclei were re-suspended in Qiazol reagent (Qiagen)
and RNA was isolated as recommended by the manufacturer. Following digestion 
with DNase I (Qiagen) RNA was cleaned up with the Qiagen RNeasy mini-kit 
with a second, on-column, DNase I digestion. Equal amounts of RNA from each 
sample were reverse-transcribed using random-hexamers and SuperScript III 
(Invitrogen). qPCR reactions were performed using primers complementary to 
human snRNAs (described and validated by Galiveti et al.) or the chromatin-associated
HotAir ncRNA (control for data normalization), using the iQ SYBR
Green Supermix (Bio-Rad) in a CFX96 Touch Real-Time PCR Detection System
(Bio-Rad). Absence of contaminating genomic DNA was verified by the lack of
amplified products for all sample/primer sets by inclusion of mock
reverse-transcription reactions in which no enzyme was added.

Exon-specific RT-PCR: RNA extraction, reverse transcription and PCR.
Experiments were performed as described by Ahn et al. In brief, RNA was
isolated from quiescent RPE-1 cells using the RNeasy kit (QIAGEN). Equal
RNA amounts from each sample were reverse transcribed using random
hexamers and SuperScript III RT, and cDNAs were PCR-amplified using the indicated
primers and Taq DNA Polymerase (NEB). PCR products were size-fractionated
by gel electrophoresis and visualized by ethidium bromide staining. Signal
intensities of amplified fragments containing either the unspliced or the spliced intron
were normalized to the levels of the respective fragments in untreated cells and
expressed as fold change in relative abundance. All experiments were repeated a
minimum of three times. Amplification of constitutive exons from TUBA1B and
GAPDH was used as a control for general splicing efficiency. Primer sets used
for amplifications were:
FANCG (exon 5-6) 5′-GGATGTCCTCCTGACAGCAT-3′ and 5′-GCTGTGTACACCTGGACCAA-3′;
AKT1 (exon 11-12) 5′-ACAAGGACGGGCACATTAAG-3′ and 5′-ACCGCACATCATCTCGTACA-3′;
AURKA (exon 9-10) 5′-AATGATTGAAGGTCGGATGC-3′ and 5′-TCTGGCTGGGATTATGCTTC-3′;
AURKB (exon 6-7) 5′-TGCAGAAGAGCTGCACATTT-3′ and 5′-TCTTCAGCTCTCCCTTGAGC-3′;
TUBA1B (exon 2-3) 5′-CCGGGCTGTGTTTGTAGACT-3′ and 5′-GATCTCCTTGCCAATGGTGT-3′;
ATM (exon 19-20) 5′-AAGGAGCTTCCTGGAGAAGAG-3′ and 5′-AACTGTCCTTGAGCATCCCTT-3′;
ATR (exon 33-34) 5′-AAGGAGCCTATCCTGGCTCTC-3′ and 5′-CTACCCTGGCACTCTGCAGCC-3′.
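
The band quantification described above amounts to dividing each treated signal by its untreated counterpart, and the legends later report a ratio of unspliced (U) to spliced (S) abundance. A minimal sketch of that arithmetic follows; the function names and band intensities are hypothetical, and combining the two fold changes into a single U/S value is an assumption about how the ratio is formed.

```python
def fold_change(treated: float, untreated: float) -> float:
    """Signal in treated cells relative to the same fragment in untreated cells."""
    if untreated <= 0:
        raise ValueError("untreated signal must be positive")
    return treated / untreated


def unspliced_to_spliced(unspliced_treated: float, unspliced_untreated: float,
                         spliced_treated: float, spliced_untreated: float) -> float:
    """Ratio of the unspliced fold change to the spliced fold change (U/S)."""
    spliced_fc = fold_change(spliced_treated, spliced_untreated)
    if spliced_fc == 0:
        raise ValueError("spliced fold change is zero; ratio undefined")
    return fold_change(unspliced_treated, unspliced_untreated) / spliced_fc


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary units.
    print(unspliced_to_spliced(300.0, 100.0, 50.0, 100.0))
```

A value above 1 in this sketch means the intron-containing product gained abundance relative to the spliced product after treatment, which is how intron retention would show up.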

Paired-end RNA-seq and data analysis. RPE-1 cells were grown to confluence 
and serum-deprived for 72 h to ensure quiescence. Cells were then mock-treated 
or UV-irradiated with 20 J m⁻² UVC (245 nm), in the presence or absence of
10 µM of the ATM inhibitor. Each treatment was performed in duplicate plates
that were used as biological replicates. RNA was isolated 6 h post-irradiation with 
the RNeasy kit (QIAGEN) and PolyA RNA was isolated using the Dynabeads 
mRNA purification kit (Invitrogen). Sample integrity was verified by the Agilent 
2100 Bioanalyzer (Agilent Technologies). For all samples the Bioanalyzer RNA 
integrity scores (RIN) were 9.3-10, indicating excellent RNA quality. For each 
sample a cDNA library was prepared and validated using the Illumina TruSeq 
RNA sample preparation kit v2 according to the manufacturer’s instructions. In 
brief, equal amounts (200 ng) of poly(A)-RNA were chemically fragmented, 
cDNA was generated using random hexamers as primers and adapters were 
ligated. RNA fragmentation efficiency and similarity between samples was con- 
firmed by the Bioanalyzer after adaptor ligation (average fragment sizes were 
317-344 bp). Following PCR amplification RNA-seq was performed according 
to the Illumina TruSeq v3 protocol on the HiSeq2500 platform, generating 
paired-end, 100-bp reads (9 X 10’ reads per sample). Raw reads were aligned 
against the human genome assembly (hg19) using TopHat. Uniquely mapped
reads were used for the identification of alternative splicing events using
multivariate analysis of transcript splicing (MATS) as previously described.
Alternative splicing events that were significantly increased (P < 0.05, n = 2)
by UV irradiation (untreated versus UV-irradiated cells) by a minimum of 
10% difference were considered to be UV-induced. Each individual alternative 
splicing event identified in the first analysis to be UV-induced that also decreased 
by a minimum of 10% in UV-irradiated cells in absence of ATM activity (UV- 
irradiated cells versus ATM-inhibitor-treated/UV-irradiated cells) was consid- 
ered to be (partly) dependent on ATM activity. 
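
The two-step selection of UV-induced and ATM-dependent splicing events can be expressed as a pair of filters. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the `SplicingEvent` record, its field names and the example values are hypothetical, and the "10% difference" is interpreted here as a difference of 0.10 in inclusion level between conditions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

MIN_DELTA = 0.10  # minimum change in inclusion level (assumed meaning of "10%")
ALPHA = 0.05      # significance threshold for untreated versus UV


@dataclass
class SplicingEvent:
    event_id: str
    level_untreated: float    # inclusion level, untreated cells
    level_uv: float           # inclusion level, UV-irradiated cells
    level_uv_atmi: float      # inclusion level, UV plus ATM inhibitor
    p_uv_vs_untreated: float  # P value, untreated versus UV


def is_uv_induced(event: SplicingEvent) -> bool:
    """Significant increase of at least MIN_DELTA after UV."""
    increase = event.level_uv - event.level_untreated
    return event.p_uv_vs_untreated < ALPHA and increase >= MIN_DELTA


def is_atm_dependent(event: SplicingEvent) -> bool:
    """UV-induced event that drops by at least MIN_DELTA when ATM is inhibited."""
    decrease = event.level_uv - event.level_uv_atmi
    return is_uv_induced(event) and decrease >= MIN_DELTA


def classify(events: Iterable[SplicingEvent]) -> Dict[str, List[str]]:
    """Group event identifiers by the two criteria."""
    events = list(events)
    return {
        "uv_induced": [e.event_id for e in events if is_uv_induced(e)],
        "atm_dependent": [e.event_id for e in events if is_atm_dependent(e)],
    }
```

Because the second filter is applied only to events that pass the first, the ATM-dependent set is always a subset of the UV-induced set, matching the order of analysis in the text.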

Statistical analysis. All data presented were reproduced in at least three
independent experiments. Statistical analysis was performed using the GraphPad
Prism software unless otherwise stated. Significance of differences was
evaluated with either Student’s t-test, when only two groups were compared, or
one-way ANOVA for more than two groups. No statistical methods were used to
predetermine sample size. One-way ANOVA was followed by post hoc analysis
either by Dunnett’s test (for comparison of experimental conditions to control) or
Bonferroni’s test (comparison between groups). *P < 0.05, **P < 0.01 and
***P < 0.001. Proteomic data statistical analysis was performed with Perseus
(1.5.0.30). Significance B was calculated by estimating the variance of the
distribution of all protein ratios, taking into account the dependency of the
distribution on the summed protein intensity. Peptides with a significance B value
P ≤ 0.05 in either the forward or reverse experiment were considered significant
and indicated by (+) in Extended Data Fig. 1d and Supplementary Table 1.
Significant alternative splicing events were identified by MATS; only
UV-triggered events with P < 0.05 were used for further analysis.


39. Schwertman, P. et al. UV-sensitive syndrome protein UVSSA recruits USP7 to
regulate transcription-coupled repair. Nature Genet. 44, 598-602 (2012).
40. Nakazawa, Y., Yamashita, S., Lehmann, A. R. & Ogi, T. A semi-automated
non-radioactive system for measuring recovery of RNA synthesis and unscheduled
DNA synthesis using ethynyluracil derivatives. DNA Repair (Amst.) 9, 506-516
(2010).
41. Houtsmuller, A. B. & Vermeulen, W. Macromolecular dynamics in living cell nuclei
revealed by fluorescence redistribution after photobleaching. Histochem. Cell Biol.
115, 13-21 (2001).
42. Wuarin, J. & Schibler, U. Physical isolation of nascent RNA chains transcribed by
RNA polymerase II: evidence for cotranscriptional splicing. Mol. Cell. Biol. 14,
7219-7225 (1994).
43. Galiveti, C. R., Rozhdestvensky, T. S., Brosius, J., Lehrach, H. & Konthur, Z.
Application of housekeeping npcRNAs for quantitative expression analysis of
human transcriptome by real-time PCR. RNA 16, 450-461 (2010).
44. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with
RNA-Seq. Bioinformatics 25, 1105-1111 (2009).
45. Shen, S. et al. MATS: a Bayesian framework for flexible detection of differential
alternative splicing from RNA-Seq data. Nucleic Acids Res. 40, e61 (2012).

©2015 Macmillan Publishers Limited. All rights reserved 




[Extended Data Figure 1: panel labels and table recovered from the figure image]

a, SILAC-based quantitative proteomics. Human dermal fibroblasts (C5Ro-TERT);
forward and reverse labelling in heavy and light media; serum deprivation;
chromatin isolation; equal protein amounts mixed; LC-MS/MS.

b, Cell fractionation for crude chromatin preparation:
1. Cell harvesting in isotonic buffer
2. Lysis of plasma membrane and detergent-extraction of nuclear proteins (cytoplasm/nucleoplasm)
3. Low speed centrifugation to obtain depleted nuclei
4. MNase digestion to mononucleosomes and ammonium sulfate extraction
5. Centrifugation (pellet; chromatin)

c, NER-protein abundance (XPA, PCNA, H2A; UV 0 or 20 J m⁻²). C/N, cytoplasm/nucleoplasm;
CHR, MNase-digested chromatin; P, pellet.

d, Chromatin proteomics.

snRNP   Protein     Fold change   Significant
U1      SNRNP70     0.9472
U1      U1A         0.99487
U1      U1C         ND
U2      SF3A1       0.71816       +
U2      SF3B1       0.78634       +
U2      SF3B14      ND
U2      SF3B2       0.74446       +
U2      SF3B3       ND
U2      SF3B4       ND
U2      SF3B5       0.64266       +
U2      SNRPB2      0.68718       +
U4      PPIH        0.98799
U4      NHP2L1      0.89354
U5      EFTUD2      0.70443       +
U5      SNRNP200    0.70876       +
U5      DDX23       ND
U5      PRP6        0.77102       +
U5      PRP8        0.69738       +
U5      SNRNP40     0.73072       +

e, SF abundance in total cell lysates: U1A (U1); SF3a1, SF3b2 (U2); PRP8, SNRNP40 (U5).

Extended Data Figure 1 | Chromatin association of splicing factors. 

a, Schematic overview of the proteomic experiments for the identification of
proteins that display UV-dependent chromatin association. b, Schematic
outline of cell fractionation. c, Validation of the chromatin-isolation protocol
for NER proteins that are recruited to chromatin in response to DNA damage.
Mock-treated or UV-irradiated quiescent HDFs (20 J m⁻², 1 h post-irradiation)
were fractionated as outlined in b. Equal protein amounts from
each fraction were analysed by immunoblotting using antibodies against the
indicated NER proteins. Abundance of H2A is shown as a control for
chromatin-isolation efficiency. d, UV-triggered changes in chromatin
association of core splicing factors, identified by quantitative SILAC
proteomics. Proteomic experiments were performed with HDFs as outlined in
a. The table lists representative examples of splicing factors that participate
in distinct snRNP complexes and their chromatin association in response to
UV irradiation (20 J m⁻², 1 h). U2 and U5 snRNP splicing factors show
significantly reduced chromatin association (P ≤ 0.05, significance B) and are
indicated with a cross. Significance B was calculated by estimating the variance
of the distribution of all protein ratios, taking into account the dependency
of the distribution on the summed protein intensity. ND, not detected.
e, Abundance of splicing factors in total cell lysates. Total lysates were prepared
from U2OS cells that were mock-treated or UV-irradiated (20 J m⁻², 1 h
post-irradiation) and splicing factor abundance was assayed by immunoblotting.
Abundance of H2A is shown as a loading control. Right, immunoblots; left,
quantification of splicing factor signal intensities normalized to H2A (n = 3,
mean ± s.d., one-way ANOVA/Bonferroni). f, UV-dependent interaction of
splicing proteins with elongating RNAPII. Quiescent HDFs were prepared
as outlined in b except that, instead of MNase digestion, chromatin was
mechanically sheared. Elongating RNAPII was immunoprecipitated with
an antibody that specifically recognizes the Ser2-phosphorylated RNAPII
C-terminal domain (CTD) and its interaction with the U2 snRNP splicing
factors SF3a1 and SF3b2 was assayed by immunoblotting.

Extended Data Figure 2 | Validation of HDFs stably expressing GFP-tagged
splicing factors. a, Whole-cell lysates from HDFs stably expressing
eGFP-tagged PRP8, SF3a1, SNRNP40 or free eGFP were analysed by
immunoblotting using antibodies against GFP (left) or against PRP8, SF3a1
and SNRNP40 (right). Ectopically produced proteins were expressed at near or
below endogenous levels. b, Fluorescent microscopy images of GFP-tagged
splicing factors showing the expected punctate nuclear distribution. Images
were obtained at 40× magnification. c, Localization of SNRNP40-GFP in
nuclear speckles, which were visualized by immunofluorescence detection of the
speckle marker SRSF2/SC35. Images were obtained at 63× magnification.
d, Interaction of SNRNP40-GFP with endogenous splicing factors and
elongating RNAPII. Quiescent SNRNP40-GFP-expressing HDFs were
mock-treated or irradiated with 20 J m⁻² UVC. After a 3-h recovery period, cells
were lysed under native conditions and chromatin was sheared by mechanical
force. SNRNP40-GFP was immunoprecipitated from whole-cell lysates
using GFP-Trap agarose beads, and its association with endogenous splicing
factors and the large subunit of RNAPII was assayed by immunoblotting.
Non-transfected cells are shown as a negative control. SNRNP40-GFP interacts
with U2 and U5 snRNP components, suggesting that the GFP tag does not
interfere with complex formation. Interaction of SNRNP40 with its U5 snRNP
partner PRP8 is partially maintained even after MNase digestion, consistent
with its presence in U4/U6.U5 tri-snRNP complexes. Participation of
SNRNP40-GFP in co-transcriptional splicing complexes is confirmed by
co-immunoprecipitation of the active (hyperphosphorylated RNAPIIo) large
subunit of RNAPII.

Extended Data Figure 3 | Displacement of mature spliceosomes from
subnuclear sites of UV-inflicted DNA damage. a, U2OS cells stably
expressing GFP-tagged splicing factors were UV-irradiated (60 J m⁻²) through
isopore membranes resulting in DNA lesion formation in small subnuclear
areas. DNA damage sites (circled) were visualized by immunofluorescence
using an antibody against the NER recognition factor XPC. Scale bar, 5 µm.
b, SF3a1-GFP and PRP8-GFP depletion from UVC laser microbeam
irradiation sites. Quantification of 20 cells from two independent experiments.
eGFP localization at sites of DNA damage is used to demonstrate that depletion
of eGFP-tagged splicing factors is not caused by photobleaching. c, UVC laser
microbeam irradiation results in preferential displacement of U2- and
U5-associated splicing factors from DNA damage sites. Quiescent HDFs were
irradiated in a ~1-µm-diameter nuclear area via a UVC laser. GFP signal
intensity, reflecting the abundance of GFP-tagged U1, U2, U4 and U5 snRNP
components at UVC DNA damage sites, was quantified in the irradiated
and in a non-irradiated nuclear area (undamaged control). Plotted is the
fluorescence signal intensity expressed as a percentage of that before
irradiation, at the 1-min time point. Cells expressing free eGFP were used as
negative control. Representative from three independent experiments (n = 12,
mean ± s.e.m., paired t-test). d, Depletion of splicing factors from UVC laser
irradiation sites depends on active transcription. Transcription initiation was
inhibited in quiescent HDFs by prolonged α-amanitin treatment (10 µM,
≥24 h) before subnuclear UVC laser irradiation. Plotted is the SNRNP40-GFP
abundance in irradiated and non-irradiated nuclear areas at 1 min
post-irradiation. Representative from three independent experiments (n = 12,
mean ± s.e.m., one-way ANOVA/Bonferroni).

Extended Data Figure 4 | SNRNP40 reorganization and speckle
enlargement in response to UV irradiation. Representative microscopic
images showing SNRNP40-GFP distribution in quiescent HDFs before, and
1 h post UVC irradiation with 20 J m⁻². a, Live cells. b, Fixed cells. Images were
obtained at 63× magnification.

Extended Data Figure 5 | Transcription stalling mobilizes spliceosomes
independent from NER complex assembly and proteasome activity. a, RNA
synthesis is inhibited preferentially by genotoxins that inflict bulky DNA
lesions. Influence of genotoxins on RNA synthesis of quiescent HDFs was
measured by 5EU pulse labelling combined with click chemistry. Top,
representative images; bottom, quantification of fluorescence intensity
(n = 150, mean ± s.e.m., one-way ANOVA/Bonferroni). Images were obtained
at 40× magnification. b, Mobilization of U2 and U5 snRNPs by genotoxins
inflicting transcription-blocking DNA lesions. Mobilization of GFP-tagged
SF3a1 (left) and PRP8 (right) assayed by FRAP in quiescent HDFs exposed to
different types of genotoxins (n = 30, mean ± s.e.m., one-way ANOVA/
Dunnett’s). IR, ionizing radiation. c, Chromatin displacement of mature
spliceosomes is not TC-NER-dependent. Chromatin abundance of U2 and
U5 snRNP splicing factors was assayed by immunoblotting in XPA-deficient
(left), XPA-GFP-corrected (middle) and CSB-deficient (right) HDFs. Cells
were UV-irradiated (20 J m⁻²) and chromatin was isolated at the indicated
times. Top, immunoblots; bottom, quantification of splicing factor signal
intensities normalized to H2A (n = 3, mean ± s.d., one-way ANOVA/
Bonferroni). d, Proteasome activity is not required for UV-damage-induced
spliceosome mobilization. Mobilization of SNRNP40-GFP assayed by FRAP
in quiescent HDFs exposed to UV radiation in the presence or absence of the
proteasome inhibitor MG132 (50 µM) (n = 30, mean ± s.e.m., t-test).
e, SNRNP40-GFP mobilization by transcription inhibition. FRAP of
SNRNP40-GFP in quiescent HDFs after inhibition of transcription initiation
(10 µg ml⁻¹ α-amanitin, 24 h) or elongation (1 µg ml⁻¹ actinomycin D or
50 µM DRB, 1 h) (n = 30, mean ± s.e.m., one-way ANOVA/Dunnett’s).

Extended Data Figure 6 | ATM-dependency of UV-induced spliceosome
mobilization, alternative splicing and gene expression changes. a, UV
irradiation and DRB-dependent mobilization of SNRNP40. Quiescent HDFs
expressing SNRNP40-GFP were UV-irradiated or DRB-treated with doses that
inhibit transcription to similar levels. Splicing factor mobility was assayed by
FRAP. b, Additive effect of combined UV and DRB treatments. FRAP of
SNRNP40-GFP in quiescent HDFs treated with DRB, UV, or a combination of
both, each at a dose that inhibits RNA synthesis by ~50%. c, Impaired
UV-dependent SF3a1 mobilization in cells lacking ATM activity. SF3a1-GFP
mobilization was measured by FRAP in quiescent HDFs derived from an ataxia
telangiectasia (AT) patient or a healthy donor. d, ATM-dependent spliceosome
mobilization. Quiescent HDFs were treated with 10 µM ATM (KU55933), ATR
(VE821) or DNA-PK (NU7441) inhibitors before irradiation. GFP-tagged
SF3a1 or PRP8 mobility was assayed by FRAP. ATM, but not ATR or DNA-PK,
inhibition partially prevented the UV-induced splicing factor mobilization.
a-d, n = 25, mean ± s.e.m., one-way ANOVA/Bonferroni. e, Reduced
UV-induced intron retention in response to ATM silencing. Intron inclusion in
retina pigment epithelium (RPE) cells transfected either with control or
ATM-silencing siRNAs and subsequently mock-treated or UV-irradiated (20 J m⁻²,
6 h) was assayed by RT-PCR. f, ATM-dependent changes in intron retention.
Intron inclusion was assayed by RT-PCR in untreated, UV-irradiated and
DRB-treated quiescent cells in the presence or absence of 10 µM ATM
inhibitor. g, Heat map of UV-triggered and ATM-dependent transcriptome
changes. Quiescent cells were mock-treated or UV-irradiated in the presence or
absence of the ATM inhibitor. Transcriptome profiles were generated by
RNA-seq. Differentially expressed genes between untreated and UV-irradiated cells
(P < 0.05) and UV-irradiated cells in the presence or absence of the ATM
inhibitor (P < 0.05) were clustered in a heat map using Pearson correlation.
n = 1,676 differentially expressed transcripts. The observed correlation
indicates that UV-inducible transcriptome changes can be, in part, prevented
by ATM inhibition. h, Lack of influence of ATM inhibition on DRB-dependent
splicing factor mobility. Splicing factor mobility was measured by FRAP in
untreated or DRB-treated HDFs in the presence or absence of 10 µM ATM
inhibitor (n = 30, mean ± s.e.m., one-way ANOVA/Bonferroni).

Extended Data Figure 7 | Canonical and non-canonical ATM activation.
a, ATM autophosphorylation (Ser1981) was assayed in quiescent HDFs 1 h
after the indicated treatments. In non-replicating cells UV and trichostatin A
(TSA) activate ATM via non-canonical pathways. Transcription inhibition by
DRB has no influence on ATM activity. b, The quiescent status of
serum-deprived HDFs was verified by immunodetection of the cell cycle marker Ki67,
which is not expressed by quiescent (G0) cells. c, Immunofluorescence
detection of active ATM in quiescent HDFs treated with DDR kinase inhibitors.
d, Immunoblotting analysis of nuclear extracts derived from quiescent HDFs
treated as in c using a phospho-specific ATM (Ser1981) antibody (top) and an
antibody recognizing ATM (bottom). e, Differences in autophosphorylated-ATM
distribution in quiescent HDFs treated with various ATM activators. Left,
multiple cells; right, single-cell magnification illustrating pan-nuclear
localization of phosphorylated ATM after UV irradiation and focal
accumulation after CPT or ionizing radiation treatments. Magnified cells are
indicated by arrows (left panel). f, Differences in amounts of DNA damage-foci
formation indicative of DSBs, in response to CPT, UV and ionizing radiation.
Quiescent HDFs were pre-treated with the ATR inhibitor (10 µM, 1 h) and
subsequently exposed to the indicated genotoxins. DSB foci were visualized
by immunofluorescence using antibodies against γH2AX and p53BP1. Left,
multiple cells; right, single-cell magnification; images were obtained at 40× and
63× magnification, respectively. Magnified cells are indicated by arrows in the
left panel.

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Extended Data Figure 8 | ATM activation by interference with spliceosome 
assembly or RNaseH1/H2A silencing. a, ATM autophosphorylation was 
assayed by immunofluorescence in HDFs after silencing of SF3a1, PRP8 or 
combined silencing of RNaseH1 and RNaseH2A. b, Immunoblotting analysis 
of silenced proteins in total cell lysates. Tubulin is shown as a loading control. 
c, Splicing factor mobilization by the spliceosome inhibitor pladienolide B (PL) 
was assayed by FRAP in quiescent HDFs. Consistent with its function in 
interfering with spliceosome maturation following pre-spliceosome assembly, 
cell treatment with pladienolide B resulted in extensive mobilization of U5 
snRNP factors (PRP8 and SNRNP40), partial mobilization of the U2 snRNP 
SF3a1, and had no influence on the U1 snRNP factor U1A (n = 30,
mean ± s.e.m., one-way ANOVA/Bonferroni). d, e, ATM activation by
pladienolide B. Quiescent HDFs were either treated with 5 µM pladienolide B
or exposed to 1 Gy ionizing radiation (IR) and autophosphorylated ATM was 
detected by immunofluorescence (d) or immunoblotting (e). f, Effect of 
pladienolide treatment on intron retention. RNA isolated from mock-treated, 
UV-irradiated or pladienolide B-treated RPE cells. Intron retention, as assayed 
by RT-PCR on transcripts of the indicated genes, shows that pladienolide B 
influences splicing to the same extent as UV irradiation. U/S, ratio of relative 
abundance of unspliced (U) to spliced (S) introns. g, Efficiency of RNaseH1 
and H2A silencing at the single-cell level, assayed by immunofluorescence. 
Images were obtained at 40× magnification.
Extended Data Figure 9 | UV-induced R-loop formation enhances
spliceosome mobilization. a, Recruitment of RNaseH1(D145N)-GFP at local 
DNA damage sites depends on endogenous levels of RNaseH activity. DNA 
damage was inflicted via a UVC laser in ~1-µm-diameter subnuclear areas of
cells after silencing of RNaseH2A or overexpression of RNaseH1-mCherry. 
Recruitment of RNaseH1(D145N)-GFP at the irradiated sites was monitored 
by live-cell imaging. Plotted is the fluorescence intensity of RNaseH1(D145N)- 
GFP at 1 min post-irradiation, at the irradiated and in a non-irradiated nuclear 
area. Representative from three independent experiments (n = 10,
mean ± s.e.m., one-way ANOVA/Bonferroni). b, c, R-loop formation at sites of
local UVC laser irradiation. Immunofluorescence detection of R-loops using
the DNA–RNA hybrid-specific S9.6 antibody. Sites of irradiation are visualized
by XPC immunodetection. b, Dashed boxes indicate the magnified areas shown
in the right panels. The dashed lines indicate the line-scan track used to
quantify fluorescence intensity of S9.6 and anti-XPC (shown in the graph).
c, Specificity of the antibody was confirmed by its increased sensitivity after 
RNaseH2A silencing and its ability to detect R-loops when suboptimal doses of 
UVC irradiation were applied. d, RNaseH1 accumulation at local DNA damage
sites depends on active transcription but not ATM activity. Transcription
initiation was inhibited in quiescent HDFs by α-amanitin (10 µg ml⁻¹, 24 h)
before local UVC laser irradiation. Plotted is the fluorescence intensity at 1 min
post-irradiation of RNaseH1(D145N)-GFP at the irradiated area and in a
non-irradiated nuclear area for untreated, ATM-inhibitor- and α-amanitin-
treated cells. Representative from three experiments (n = 10, mean ± s.e.m.,
one-way ANOVA/Bonferroni). e, RNaseH1 overexpression inhibits the 
UV-dependent spliceosome mobilization. FRAP of U2OS cells stably 
expressing GFP-tagged SF3a1 and PRP8 and transiently transfected with
RNaseH1-mCherry. f, RNaseH1 and H2A silencing potentiates the UV- 
dependent spliceosome mobilization. RNaseH1 and H2 were silenced in U2OS 
cells expressing SF3a1-GFP or PRP8-GFP and splicing factor mobility was 
assayed by FRAP. g, FRAP of SNRNP40-GFP in quiescent HDFs after 
RNaseH1/H2 silencing. e–g, n = 30, mean ± s.e.m., one-way ANOVA/
Bonferroni. 
Extended Data Figure 10 | Combined transcription inhibition and ATM
activation results in extensive mobilization of mature spliceosomes.
a, Combinatorial effect of DRB and ionizing radiation on spliceosome 
mobilization. Quiescent HDFs were exposed to ionizing radiation in the 
presence or absence of DRB, and SF3a1-GFP and PRP8-GFP mobility was
assayed by FRAP. b, The ionizing-radiation-mediated increase of DRB- 
dependent spliceosome mobilization depends on ATM activity. FRAP of GFP- 
tagged SNRNP40 in quiescent HDFs treated with DRB and/or ionizing 
radiation in the presence or absence of an ATM inhibitor. c, Spliceosome
mobilization by CPT. Quiescent HDFs were treated with 25 µg ml⁻¹ CPT,
25 µM DRB and 20 J m⁻² UV at doses that inhibit transcription to
approximately 30%, and their influence on the mobilization of GFP-tagged
SF3a1, PRP8 and SNRNP40 was measured by FRAP. a–c, n = 30,
mean ± s.e.m., one-way ANOVA/Bonferroni. d, Inhibition of RNA synthesis
by the treatments shown in c was assayed in quiescent HDFs by 5EU
incorporation and click chemistry (n = 150, mean ± s.e.m., one-way
ANOVA/Dunnett’s). 
Self-similar energetics in large clusters of galaxies


Francesco Miniati¹ & Andrey Beresnyak²


Massive galaxy clusters are filled with a hot, turbulent and 
magnetized intra-cluster medium. Still forming under the action 
of gravitational instability, they grow in mass by accretion of 
supersonic flows. These flows partially dissipate into heat through 
a complex network of large-scale shocks’, while residual transonic 
(near-sonic) flows create giant turbulent eddies and cascades”. 
Turbulence heats the intra-cluster medium* and also amplifies 
magnetic energy by way of dynamo action®*. However, the 
pattern regulating the transformation of gravitational energy into 
kinetic, thermal, turbulent and magnetic energies remains 
unknown. Here we report that the energy components of the 
intra-cluster medium are ordered according to a permanent hier- 
archy, in which the ratio of thermal to turbulent to magnetic 
energy densities remains virtually unaltered throughout the clus- 
ter’s history, despite evolution of each individual component and 
the drive towards equipartition of the turbulent dynamo. This 
result revolves around the approximately constant efficiency of 
turbulence generation from the gravitational energy that is freed 
during mass accretion, revealed by our computational model of 
cosmological structure formation*”. The permanent character of 
this hierarchy reflects yet another type of self-similarity in cos- 
mology’®’, while its structure, consistent with current data’*'*, 
encodes information about the efficiency of turbulent heating and 
dynamo action. 

The computational model we use captures the turbulent motions 
through a multi-scale technique that employs six nested grids cover- 
ing progressively larger volumes with correspondingly coarser reso- 
lution elements*”’. The finest grid resolves the virial volume of the 
galaxy cluster (GC) with more than a billion uniformly-sized reso- 
lution elements, and provides the necessary dynamic range to 
resolve the intra-cluster medium (ICM) turbulent cascade. The lar- 
gest grid covers the chosen cosmological volume of 340 co-moving 
Mpc on a side (‘co-moving’ indicates that the system is taking part 
in the expansion of the universe; 1 Mpc $\approx 3 \times 10^{6}$ light years). The
intermediate grids allow us to account, with adequate accuracy, for 
the tidal forces produced by the matter distribution outside the GC 
volume. The calculation starts with three grids and adds progres- 
sively finer grids as the Lagrangian volume of the GC shrinks under 
self-gravity. All six grids are in place at a time corresponding to 
8 Gyr after the Big Bang. At the current time, 13.8 Gyr after the 
Big Bang, the simulated GC has a total virial mass of $1.3 \times 10^{15}$
solar masses ($M_\odot$).

Figure 1 shows a snapshot of the simulation, illustrating the cos- 
mological context and the highly-turbulent conditions of the flow 
inside the GC volume. The fine resolution across the GC volume allows 
us to accurately measure the time-dependent statistical properties of 
structure-formation-driven ICM turbulence including, in particular, 
the dissipation rate, $\epsilon_{\rm turb}$, the outer scale, $L$, and the dispersion of the
velocity increment on that scale, $\langle(\delta u_L)^2\rangle^{1/2}$ (Methods). In the following,
we restrict our analysis to a region within $R_{\rm vir}/3 \approx 1$ co-moving
Mpc, which is most relevant for comparison with observations, where
the virial radius $R_{\rm vir}$ characterizes a volume that has nominally reached
dynamical equilibrium (Methods).


Hydrodynamic turbulence is dominated by the solenoidal compon- 
ent, which accounts for 60-90% of the total kinetic energy*”®. Detailed 
analysis shows that this component remains statistically homogeneous
and isotropic thus resembling Kolmogorov’s cascade, despite the pres- 
ence of considerable structure in the ICM’. The dissipation of incom- 
pressible turbulence contributes to ICM heating* along with shocks 
and adiabatic compression, and to the growth of magnetic energy by 
way of small-scale dynamo action** (see Extended Data Fig. 1 for 
cascade details). The turbulent dissipation rate associated with the 
solenoidal component is estimated from the numerical simulation 
data. Because ICM turbulence is driven by various complex hydrodyn- 
amic mechanisms ultimately powered by the unsteady mass accretion 
process’, the dissipation rate is highly changeable with time and exhi- 
bits non-monotonic variations of more than one order of magnitude” 
(Fig. 2c). 

However, alongside much complexity, turbulent dissipation 
appears to also exhibit simplicity of behaviour. This is shown in 
Fig. 2a, which illustrates the temporal evolution of the fraction of 
thermal energy originating from turbulent dissipation, $\eta_{\rm turb}$. In contrast
to $\epsilon_{\rm turb}$, this quantity remains remarkably constant during the GC
lifetime, $\eta_{\rm turb} \approx 0.3$–$0.4$, indicating that the efficiency of turbulence
generation out of gravitational energy freed by mass accretion is
approximately constant. In addition, Fig. 2b shows that the turbulence
velocity dispersion at the outer scale normalized to the ICM sound
speed ($c_s$), that is, the turbulence Mach number $M_{\rm turb} = \langle(\delta u_L)^2\rangle^{1/2}/c_s$,
also remains almost constant with time. This is certainly related to the
fact that the GC remains dynamically young, that is, it continues to
accrete mass and grow in size, but also shows that in the ICM the
evolution of the turbulent kinetic energy and the thermal energy are
closely related, consistent with Fig. 2a. The value of $M_{\rm turb}$ can be
understood as follows. If the generation of the bulk of the thermal
energy, $E_{\rm th}$, is dominated by the last $\kappa$ eddy turnover times,
$\tau_L = L/\delta u_L$, then

$$E_{\rm th} \simeq \eta_{\rm turb}^{-1}\,\kappa\,\tau_L\,\rho\,\epsilon_{\rm turb} = \frac{\kappa}{3^{3/2}\,\eta_{\rm turb}}\,\rho\,\langle(\delta u_L)^2\rangle,$$

where $\rho$ is the mass density and we have used the known relation
$\epsilon_{\rm turb} = (2/3C)^{3/2}\langle(\delta u_L)^2\rangle^{3/2}/L$ with the coefficient of the correlation
function $C \approx 2$ (ref. 21). It is straightforward to then see that the Mach
number $M_{\rm turb} = (\sqrt{3}/\kappa)^{1/2}(\eta_{\rm turb}/0.37)^{1/2}$ which, for $\kappa = 1.5$–$3$, ranges
between 0.8 and 1.2, confirming the result in Fig. 2. It also follows that
the ratio of thermal to turbulent kinetic energy is

$$\frac{E_{\rm th}}{\tfrac{1}{2}\rho\,\langle(\delta u_L)^2\rangle} = \frac{2\kappa}{3^{3/2}\,\eta_{\rm turb}} \approx \frac{1}{\eta_{\rm turb}} \qquad (1)$$
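
As a quick numerical check of the Mach-number estimate above, the following minimal sketch evaluates the quoted relation for a few values of κ and η_turb. The function names are illustrative and are not part of the simulation code. The thermal-to-kinetic ratio is an assumption-laden rearrangement: it takes C = 2 and supposes that turbulent dissipation over the last κ turnover times supplies a fraction η_turb of the thermal energy.

```python
import math


def turbulent_mach(kappa, eta_turb):
    """Mach number from the relation quoted in the text:
    M_turb = (sqrt(3)/kappa)^(1/2) * (eta_turb/0.37)^(1/2)."""
    return math.sqrt(math.sqrt(3.0) / kappa) * math.sqrt(eta_turb / 0.37)


def thermal_to_turbulent_ratio(kappa, eta_turb):
    """E_th / (0.5 * rho * <du_L^2>) = 2*kappa / (3^(3/2) * eta_turb).
    Assumes C = 2 in the dissipation-rate relation (see lead-in)."""
    return 2.0 * kappa / (3.0 ** 1.5 * eta_turb)


if __name__ == "__main__":
    for kappa in (1.5, 2.0, 3.0):
        for eta in (0.3, 0.4):
            print(
                f"kappa={kappa:.1f} eta_turb={eta:.1f} "
                f"M_turb={turbulent_mach(kappa, eta):.2f} "
                f"E_th/E_kin={thermal_to_turbulent_ratio(kappa, eta):.2f}"
            )
```

For κ between 1.5 and 3 and η_turb between 0.3 and 0.4 the sketch returns Mach numbers of roughly 0.7 to 1.1, that is, of order unity as stated in the text.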


Generation of magnetic field by small-scale dynamo processes in 
a turbulent flow follows from standard theory. In a high Reynolds num- 
ber (Re > 10°) flow such as the ICM, an initial seed of vanishing 
strength’ is amplified exponentially at the rate $\Gamma = \sqrt{Re}/(30\,\tau_L)$.
After a short while ($\propto Re^{-1/2}$), the magnetic field stops growing below
the characteristic Alfvén scale, $L_A = v_A^3/(C^{3/2}\epsilon_{\rm turb})$, where $v_A = B/\sqrt{4\pi\rho}$
is the Alfvén speed and B the magnetic field strength, due to the 
back-reaction of magnetic tension. Nevertheless, magnetic energy 
Figure 1 | High-resolution simulation of a galaxy cluster in fully
cosmological context. a, b, Baryonic gas in the large-scale structure of the 
universe (a; bright is high, dark is low), and a zoom-in on the GC centre where 
numerical resolution is highest (b; colour-map inverted). Dynamic range of 
density (in cm$^{-3}$) is ~10°. The black dashed line marks the virial radius, $R_{\rm vir}$,


continues to grow at the expense of turbulent kinetic energy as $L_A$,
which marks the equipartition scale between kinetic and magnetic 
energy, shifts towards larger values’. Growth, however, is now pro- 
portional to the turbulent dissipation rate instead of being expo- 
nential with time. It is in this latter stage that the dynamo spends 
most of the time®. Recent state-of-the-art numerical work finds that 
for statistically isotropic and homogeneous turbulence, as found for 
the solenoidal component of the ICM””®, the efficiency of the con- 
version of turbulent (kinetic) to magnetic energy is a universal 
number, $C_E$ = 4–5% (ref. 8).

Therefore, the evolution of magnetic energy in the ICM can be 
expressed in terms of the turbulence dissipation history as 


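$$E_B(t) = \frac{B^2}{8\pi} = C_E \int_0^t d\tau\,\rho\,\epsilon_{\rm turb}(\tau).$$

Combined with the above finding
about $\eta_{\rm turb}$, this leads to simple but significant expressions relating
the fundamental properties of magnetic field and turbulence in the
ICM. In fact, since turbulence dissipation contributes a constant
fraction, $\eta_{\rm turb} \approx 1/3$, of the ICM thermal energy, the ratio $\beta_{\rm plasma}$
of thermal pressure, $P_{\rm gas}$, to magnetic energy can be written as

$$\beta_{\rm plasma} \equiv \frac{P_{\rm gas}}{B^2/8\pi} = \frac{\gamma - 1}{C_E\,\eta_{\rm turb}} = 40\left(\frac{\eta_{\rm turb}}{1/3}\right)^{-1}\left(\frac{C_E}{0.05}\right)^{-1} \qquad (2)$$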
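
The two relations above can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the adiabatic index γ = 5/3 (monatomic ideal gas) is an assumption needed to recover the quoted value of 40, and the time integral is approximated with a simple trapezoidal rule over whatever dissipation history is supplied.

```python
def beta_plasma(eta_turb, c_e, gamma=5.0 / 3.0):
    """Ratio of thermal pressure to magnetic energy density,
    beta = (gamma - 1) / (C_E * eta_turb); gamma = 5/3 is assumed."""
    return (gamma - 1.0) / (c_e * eta_turb)


def magnetic_energy_density(times, rho_eps, c_e):
    """Trapezoidal estimate of C_E * integral of rho*eps_turb dt.
    `times` and `rho_eps` are equal-length sequences (any consistent units)."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (rho_eps[i] + rho_eps[i - 1]) * (times[i] - times[i - 1])
    return c_e * total


if __name__ == "__main__":
    print(beta_plasma(1.0 / 3.0, 0.05))
    # Constant, made-up dissipation history purely for illustration.
    print(magnetic_energy_density([0.0, 1.0, 2.0], [1.0, 1.0, 1.0], 0.05))
```

Because β_plasma depends only on the product of the two efficiencies, doubling either C_E or η_turb halves it.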

This means that for a massive GC, $\beta_{\rm plasma}$ is a constant, which depends
neither on the specifics of the ICM conditions including turbulence,
nor on the GC mass or age. It is instead simply determined by two
fundamental parameters, $C_E$ and $\eta_{\rm turb}$, which describe the efficiency of
magnetic field generation by the turbulent dynamo and of turbulent
heating in structure formation, respectively. This is shown in Fig. 3a,
where $\beta_{\rm plasma}$ is plotted as a function of cosmic time and exhibits 25%
r.m.s. fluctuations, which should also characterize massive cluster-to-
cluster variations. We can also compute the Alfvén scale. Since the
turbulence is non-stationary and the magnetic properties respond on a
timescale of the order of 1-2 eddy turnover times”? we average the 


enclosing the volume that has nominally reached dynamical equilibrium. 

c, Vorticity magnitude on a scale twice the finest mesh size (~20 kpc). 

d, Temperature map. Complexity is due to shocks and contact discontinuities 
in the turbulent flow. Dynamic range of temperature (in K) and vorticity 

(in Hj +) is ~10°. 


dissipation rate $\epsilon_{\rm turb}$ over 2 Gyr when calculating $L_A$. Expressing $L_A$ in
units of the turbulence outer scale we write

$$\frac{L_A}{L} = \frac{v_A^3}{C^{3/2}\langle\epsilon_{\rm turb}\rangle L} = \left(\frac{3}{\gamma\,\beta_{\rm plasma}}\right)^{3/2}\frac{c_s^3}{\langle(\delta u_L)^2\rangle^{3/2}} \approx \frac{1}{100}\left(\frac{\beta_{\rm plasma}}{40}\right)^{-3/2}\left(\frac{M_{\rm turb}}{1}\right)^{-3} \qquad (3)$$

We have already shown that both $\beta_{\rm plasma}$ and the turbulence Mach
number remain constant during the evolution of a massive and dynamically
young GC. Therefore, for such GCs, the Alfvén scale too remains
a constant fraction of the turbulence driving scale, independent of time,
GC mass and ICM conditions. In addition, given the large value of
$\beta_{\rm plasma}$ and that $M_{\rm turb} \approx 1$, $L_A$ is small compared to $L$. The temporal
evolution of $L_A/L$ is shown in Fig. 3b (see Extended Data Fig. 2 for a
typical ICM spectrum of hydromagnetic turbulence). Finally, Fig. 3c
shows that the evolution of the turbulent injection scale, $L$, closely
follows that of $L_A$ while tracking the growing characteristic scale of
the GC ($R_{500} \approx 0.5R_{\rm vir}$) also plotted in the same panel. Note that the
modulation of $L_A$ reflects the changing turbulent conditions in the ICM
and, in particular, is anti-correlated with $\epsilon_{\rm turb}$, as generally expected.
The large value of $\beta_{\rm plasma}$ indicates that magnetic energy, like turbulent
energy, is small compared to thermal energy ($E_{\rm th} \gg E_B$).
Moreover, the small value of $L_A/L$ indicates that the dynamo is far
from saturation, and magnetic energy is also small in comparison
to the turbulent kinetic energy ($E_{\rm turb} \gg E_B$). This energy hierarchy
is fundamentally due to the efficiency $\eta_{\rm turb}$ with which turbulent
energy is generated during gravitational collapse and the fraction
$C_E$ thereof that is converted into magnetic energy, namely
$E_{\rm th}:E_{\rm turb}:E_B = 1:\eta_{\rm turb}:C_E\eta_{\rm turb}$. The values of $\beta_{\rm plasma}$ and $L_A$ are in good
agreement with recent measurements of magnetic field properties
in GCs'*""*. Here, they emerge from pure numerical modelling of
Figure 2 | Temporal evolution of turbulence. a, $\eta_{\rm turb}$, the ratio of ICM
thermal energy contributed by turbulent dissipation, $\int_0^t d\tau\,\rho\,\epsilon_{\rm turb}$, to the total
thermal energy, $E_{\rm th}$. b, Turbulent Mach number, the ratio of the turbulent r.m.s.
velocity $\langle(\delta u_L)^2\rangle^{1/2}$ to the sound speed $c_s$. c, Volumetric turbulent dissipation
rate, $\epsilon_{\rm turb}$ (ref. 20). Error bars, variance of $\epsilon_{\rm turb}$. All quantities are computed
within 1/3 of the virial radius. Time in Gyr is reported on the bottom x axis, and
the cosmological redshift on the top x axis.
Figure 3 | Temporal evolution of magnetic field. a, $\beta_{\rm plasma} = P_{\rm gas}/(B^2/8\pi)$,
the ratio of ICM thermal to magnetic pressure, where $B^2/8\pi = C_E\int_0^t d\tau\,\rho\,\epsilon_{\rm turb}(\tau)$.
b, $L_A/L$, the ratio of Alfvén scale to the turbulent injection scale. $L_A(t) = v_A^3/[C^{3/2}\langle\epsilon_{\rm turb}\rangle]$,
where $v_A = B/\sqrt{4\pi\rho}$ is the Alfvén speed and $\langle\epsilon_{\rm turb}\rangle$ is the
turbulent dissipation rate smoothed over t = 2 Gyr with a Gaussian filter.
Quantities in a and b refer to a volume inside 1/3 the virial radius. The dashed 
lines show transients to the asymptotic regime for an artificial toa. = 4.5 Gyr. 
c, Turbulence injection scale L (solid black line) and the characteristic cluster size 
$R_{500} \approx 0.5R_{\rm vir}$ (dashed line), enclosing a mass over-density of 500. Time in Gyr
(bottom x axis) and the cosmological redshift (top x axis) are reported. 
structure formation turbulence and MHD dynamo action, in the
sense that they are found to derive their values from the parameters
$\eta_{\rm turb}$ and $C_E$, which are determined numerically and not through
parametric fits. Furthermore, the above energy hierarchy appears to remain
unchanged during the evolution of the GC, and the turbulent dynamo 
in the ICM is as far from saturation today as it has virtually always been 
in the past. Figure 3c shows that the GC size and, therefore, its mass 
constantly grow. This implies that the gravitational potential energy 
and, therefore, the ICM thermal energy and turbulent energy also 
continue to grow. Meanwhile dynamo action tries to bring magnetic 
and turbulent energy into equipartition. Since all of these forms of 
energy grow simultaneously at the expense of the same gravitational 
energy, but with different constant efficiencies, their ratio remains 
unchanged, reflecting the value of those intrinsic efficiencies. 
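
The scalings discussed above can be illustrated with a short sketch. It is not taken from the paper's code: the function names are hypothetical, γ = 5/3 is assumed, and the closed form for L_A/L is an algebraic rearrangement that assumes v_A² = B²/(4πρ), c_s² = γP/ρ and the dissipation-rate relation with the same constant C that appears in the definition of L_A, so that C cancels.

```python
def alfven_to_outer_scale(beta, mach, gamma=5.0 / 3.0):
    """L_A / L = (3 / (gamma * beta))^(3/2) * mach^(-3), under the
    assumptions listed in the lead-in (gamma = 5/3 by default)."""
    return (3.0 / (gamma * beta)) ** 1.5 / mach ** 3


def energy_hierarchy(eta_turb, c_e):
    """Relative energies E_th : E_turb : E_B = 1 : eta_turb : C_E * eta_turb."""
    return (1.0, eta_turb, c_e * eta_turb)


if __name__ == "__main__":
    print(alfven_to_outer_scale(beta=40.0, mach=1.0))
    print(energy_hierarchy(eta_turb=1.0 / 3.0, c_e=0.05))
```

For β_plasma = 40 and a Mach number of unity the first function gives a value close to 1/100, and the second shows magnetic energy at the per cent level of the thermal energy, consistent with a dynamo far from saturation.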

In other words, both $\beta_{\rm plasma}$ and $L_A/L$ encode the efficiency of
turbulence generation in structure formation and the efficiency of 
dynamo action. Therefore, they allow us to relate magnetic field obser- 
vations in massive GCs to such properties of structure formation. This 
is in sharp contrast with other astrophysical bodies***°—for example, 
the interstellar medium of galaxies, stars and compact objects—where 
the turbulence dynamo has long saturated and such information has 
been lost forever. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Numerical model. The simulation is carried out with CHARM, an adaptive- 
mesh-refinement cosmological code’’. This code uses a directionally un-split vari- 
ant of the piecewise parabolic method for hydrodynamics”’, a constrained-trans- 
port algorithm for solenoidal MHD”, a time-centred modified symplectic scheme 
for the collisionless dark matter, and solves Poisson’s equation with a second-order 
accurate discretization. The magnetic field remains negligible throughout, so the 
calculation is effectively hydrodynamic. For massive galaxy clusters, such as 
the Coma cluster, the ICM cooling time is a few times the age of the Universe”’, 
so cooling and baryonic feedback processes are neglected. Heating of the intergal- 
actic medium through photoionization is also neglected, with no consequences 
whatsoever for the generation of vorticity and turbulence at accretion shocks. 
We use a concordance ΛCDM universe with normalized (in units of the critical
value) total mass density, $\Omega_m = 0.2792$, baryonic mass density, $\Omega_b = 0.0462$,
vacuum energy density, $\Omega_\Lambda = 1 - \Omega_m = 0.7208$, normalized Hubble constant
$h \equiv H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}}) = 0.701$, spectral index of primordial perturbation,
$n_s = 0.96$, and r.m.s. linear density fluctuation within a sphere with a co-moving
radius of $8h^{-1}$ Mpc, $\sigma_8 = 0.817$ (ref. 34). The simulated volume has co-moving size
of $L_{\rm box} = 240h^{-1}$ Mpc on a side. The initial conditions are generated on three
refinement levels with grafic++ (made publicly available by D. Potter). For the
coarsest level we use $512^3$ co-moving cells, corresponding to a nominal spatial
resolution of $468.75h^{-1}$ co-moving kpc and $512^3$ particles of mass $6.7 \times 10^{9}h^{-1}M_\odot$
to represent the collisionless dark matter component. The additional levels allow for
refined initial conditions in the volume where the galaxy cluster forms. The refinement
ratio for both levels is $n_\ell \equiv \Delta x_\ell/\Delta x_{\ell+1} = 2$, $\ell = 0, 1$. Each refined level covers
1/8 of the volume of the next coarsest level with a uniform grid of $512^3$ co-moving
cells while the dark matter is represented with $512^3$ particles. At the finest level
the spatial resolution is $\Delta x = 117.2h^{-1}$ co-moving kpc and the particle mass is
$10^{8}h^{-1}M_\odot$. As the Lagrangian volume of the galaxy cluster shrinks under self-
gravity, three additional uniform grids covering 1/8 of the volume of the next coars-
est level are employed with $512^3$, $1{,}024^3$ and $1{,}024^3$ co-moving cells, respectively, and
$n_\ell = 2, 4, 2$, for $\ell = 2, 3, 4$, respectively. All of them are in place by redshift 1.4,
providing a spatial resolution of $7.3h^{-1}$ co-moving kpc in a region of $7.5h^{-1}$ Mpc,
accommodating the whole virial volume of the GC. The ensuing dynamic range of 
resolved spatial scales is sufficiently large for the emergence of turbulence. 
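
The grid hierarchy described above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the critical density is the standard value 2.775 × 10¹¹ h² M⊙ Mpc⁻³, and it assumes one dark-matter particle per cell carrying the mass fraction Ω_m − Ω_b, which is an assumption that happens to reproduce the quoted particle masses.

```python
RHO_CRIT = 2.775e11  # critical density in h^2 M_sun Mpc^-3 (standard value)


def cell_sizes_kpc(box_mpc=240.0, base_cells=512, ratios=(2, 2, 2, 4, 2)):
    """Cell size (h^-1 co-moving kpc) on each of the six nested grids,
    starting from the coarsest level and applying the refinement ratios."""
    dx = box_mpc * 1000.0 / base_cells
    sizes = [dx]
    for ratio in ratios:
        dx /= ratio
        sizes.append(dx)
    return sizes


def dm_particle_mass(dx_kpc, omega_m=0.2792, omega_b=0.0462):
    """Dark-matter mass per cell in h^-1 M_sun, assuming one particle per
    cell and a dark-matter density parameter of omega_m - omega_b."""
    dx_mpc = dx_kpc / 1000.0
    return (omega_m - omega_b) * RHO_CRIT * dx_mpc ** 3


if __name__ == "__main__":
    sizes = cell_sizes_kpc()
    for level, dx in enumerate(sizes):
        print(f"level {level}: dx = {dx:.2f} h^-1 kpc")
    print(f"finest grid extent: {sizes[-1] * 1024 / 1000.0:.2f} h^-1 Mpc")
    print(f"coarse particle mass: {dm_particle_mass(sizes[0]):.2e} h^-1 M_sun")
    print(f"refined particle mass: {dm_particle_mass(sizes[2]):.2e} h^-1 M_sun")
```

Under these assumptions the finest of the six grids has cells of about 7.3 h⁻¹ kpc spanning 7.5 h⁻¹ Mpc, matching the figures given in the text.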

GC characteristic quantities. The GC and its formation history are reconstructed 
using our implementation of a HOP halo finder** and merger history code. The 
virial radius is defined as the region enclosing a mass over-density $\Delta_c = 178\,\Omega_m^{0.45}$
with respect to the critical density*®. At redshift z = 0, the virial radius is $R_{\rm vir} =
1.95h^{-1}$ Mpc, and the corresponding enclosed mass, $M_{\rm vir} = 1.27 \times 10^{15} M_\odot$. Also
at z = 0, using $\Delta_c = 500$ we find the characteristic radius $R_{500} \approx 1h^{-1}$ Mpc.
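
As a rough consistency check on the virial quantities above, one can compare the mean enclosed density with the critical density. The sketch is a back-of-the-envelope calculation, not the halo finder used in the paper: the function name is hypothetical, the critical density is the standard 2.775 × 10¹¹ h² M⊙ Mpc⁻³, the mass is taken to be in M⊙ without a factor of h, and the comparison value assumes the commonly used fit Δ_c ≈ 178 Ω_m^0.45.

```python
import math

RHO_CRIT_H2 = 2.775e11  # critical density in h^2 M_sun Mpc^-3 (standard value)


def mean_overdensity(mass_msun, radius_hinv_mpc, h):
    """Mean density inside a sphere, in units of the critical density.
    The radius is given in h^-1 Mpc and converted to Mpc internally."""
    radius_mpc = radius_hinv_mpc / h
    volume = 4.0 / 3.0 * math.pi * radius_mpc ** 3
    return mass_msun / volume / (RHO_CRIT_H2 * h * h)


if __name__ == "__main__":
    measured = mean_overdensity(1.27e15, 1.95, 0.701)
    expected = 178.0 * 0.2792 ** 0.45  # assumed fit, see lead-in
    print(f"mean overdensity = {measured:.1f}, fit value = {expected:.1f}")
```

The two numbers agree to within a few per cent, which is about what the rounding of the quoted radius and mass allows.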
Turbulence characterization. The characteristic quantities describing the turbulence
are inferred from the analysis of the structure functions. This analysis is
described in detail elsewhere*’. Basically, we decompose the velocity into a
solenoidal and a compressional component using a Hodge–Helmholtz decomposition,
that is

$$\mathbf{v} = \mathbf{v}_s + \mathbf{v}_c,\quad \mathbf{v}_c = -\nabla\phi,\quad \mathbf{v}_s = \nabla\times\mathbf{A},\quad \phi = \frac{1}{4\pi}\int \frac{\nabla'\cdot\mathbf{v}}{r}\,d^3x',\quad \mathbf{A} = \frac{1}{4\pi}\int \frac{\nabla'\times\mathbf{v}}{r}\,d^3x' \qquad (4)$$

and then we compute the second and third order structure functions of velocity
increments of the solenoidal component, $\delta v_i = [\mathbf{v}_s(\mathbf{x}+\mathbf{l}) - \mathbf{v}_s(\mathbf{x})]_i$ (ref. 21)

$$S_p^i(l) = \langle(\delta v_i)^p\rangle \qquad (5)$$

where p = 2, 3 indicates the structure function order, and i indicates the projection
along or perpendicular to $\mathbf{l}$, for the longitudinal and transverse structure functions,
respectively. To compute the structure functions we define sampling points ran- 
domly distributed inside the volume of interest (within 1/3 of the virial radius), and 
compute the velocity difference with respect to other randomly selected field 
points at a maximum distance of two virial radii. Once the velocity structure 
functions are computed, we define the velocity dispersion as half the asymptotic 
values of the second order structure function, and the outer scale as the separation 
at which that asymptotic value is reached. To compute the Mach number we
divide the turbulent velocity dispersions by the sound speed, $c_s = \sqrt{\gamma P/\rho}$, computed
by evaluating the mean value of each thermodynamic quantity within the
same volume in which the sampling points are collected. Finally, the turbulent 
dissipation rate is computed by identifying the inertial range of the second and 
third order structure functions of the solenoidal velocity increments (for details 
see ref. 20). 
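
The sampling procedure described above can be sketched as a simple Monte Carlo estimator. This is a toy illustration under stated assumptions, not the analysis code of ref. 20: the `velocity` callable, the cubic sampling region and the linear binning are all hypothetical choices, and only the longitudinal structure function is computed.

```python
import math
import random


def longitudinal_increment(v_a, v_b, sep):
    """Project the velocity difference v_b - v_a onto the unit separation vector."""
    norm = math.sqrt(sum(c * c for c in sep))
    return sum((vb - va) * c for va, vb, c in zip(v_a, v_b, sep)) / norm


def structure_function(velocity, n_pairs, l_max, n_bins, order=2, seed=0):
    """Estimate S_p(l) = <(dv_L)^p> in linear bins of separation l.

    velocity: callable (x, y, z) -> (vx, vy, vz); a hypothetical interface.
    Sampling points are drawn uniformly in the cube [-1, 1]^3 (an assumption).
    Returns (bin_centres, values); empty bins are reported as None.
    """
    rng = random.Random(seed)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for _ in range(n_pairs):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        sep = [rng.uniform(-l_max, l_max) for _ in range(3)]
        l = math.sqrt(sum(c * c for c in sep))
        if l == 0.0 or l >= l_max:
            continue  # keep only separations inside the sphere of radius l_max
        y = [a + b for a, b in zip(x, sep)]
        dv = longitudinal_increment(velocity(*x), velocity(*y), sep)
        k = min(int(l / l_max * n_bins), n_bins - 1)
        sums[k] += dv ** order
        counts[k] += 1
    centres = [(k + 0.5) * l_max / n_bins for k in range(n_bins)]
    values = [s / c if c else None for s, c in zip(sums, counts)]
    return centres, values


if __name__ == "__main__":
    # Toy field v = 2x, for which the longitudinal increment is exactly 2*l.
    centres, s2 = structure_function(lambda x, y, z: (2 * x, 2 * y, 2 * z),
                                     n_pairs=20000, l_max=1.0, n_bins=4)
    for l, value in zip(centres, s2):
        print(f"l = {l:.3f}  S_2 = {value}")
```

In a real analysis the asymptotic plateau of the second-order function would then give the velocity dispersion and the outer scale, as described in the text.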

Code availability. We have opted not to make the code available for practical 
reasons. However, the methods we adopt are published in the literature and are 
commonly used in the community. Amongst others, the publications mentioned 
in the above Methods section contain tests of our code against problems with 
known solutions and also with respect to solutions obtained with similar codes 
from independent authors. 
Diagram of ICM Turbulent Cascade
[Diagram labels: kinetic energy; heat; Fermi I → relativistic particles; relativistic particles]
Extended Data Figure 1 | Generation and cascade of ICM hydromagnetic
turbulence. First the gravitational potential energy is converted into kinetic
energy of accretion flows. These generate shear and shocks which, in addition to
dissipation, produce fluid instabilities and a baroclinic term, respectively,
leading to turbulent flows. Shocks also accelerate particles via the Fermi I
mechanism. Shocks do not dissipate tangential flows, which will either generate
turbulence, shear or shocks, or a combination thereof. The turbulence cascade
includes dissipation of compressible modes at weak shocks, conversion of
turbulent to magnetic energy via dynamo action, excitation of plasma waves
accelerating relativistic particles via Fermi II mechanism, and viscous
dissipation.
Spectrum of ICM Hydromagnetic Turbulent Cascade
[Figure: schematic energy spectrum versus log(k). Marked wavenumbers, from left to right: $k = 1/R_{\rm vir}$, $k = 1/L$ and $1/L_{\rm Ozmidov}$, $k = 1/L_A$, $k = 1/\ell_{\rm diss}$. Annotations label the energy-containing scale and Kolmogorov’s scale.]
Extended Data Figure 2 | Spectrum of ICM hydromagnetic turbulent
cascade. Characteristic spectrum of turbulent kinetic energy in the ICM. Solid
and dashed lines correspond to the solenoidal (Kolmogorov-like) and the
compressional (Burgers-like) velocity field, respectively. On the x axis, from left
to right, we have marked the virial scale, $R_{\rm vir}$, the injection scale, $L$, the
Ozmidov’s scale, $L_O$, the Alfvén scale, $L_A$, and Kolmogorov’s dissipation scale,
$\ell_{\rm diss}$. All quantities are time dependent and Ozmidov’s scale is comparable to
the injection scale, so at times turbulence in the radial direction could be
suppressed by stratification.
Large heterogeneities in comet 67P as revealed by
active pits from sinkhole collapse 


Jean-Baptiste Vincent, Dennis Bodewits, Sébastien Besse, Holger Sierks, Cesare Barbieri, Philippe Lamy, Rafael Rodrigo,
Detlef Koschny, Hans Rickman, Horst Uwe Keller, Jessica Agarwal, Michael F. A’Hearn, Anne-Thérèse Auger,
M. Antonella Barucci, Jean-Loup Bertaux, Ivano Bertini, Claire Capanna, Gabriele Cremonese, Vania Da Deppo,
Björn Davidsson, Stefano Debei, Mariolino De Cecco, Mohamed Ramy El-Maarry, Francesca Ferri, Sonia Fornasier,
Marco Fulle, Robert Gaskell, Lorenza Giacomini, Olivier Groussin, Aurélie Guilbert-Lepoutre, P. Gutierrez-Marques,
Pedro J. Gutiérrez, Carsten Güttler, Nick Hoekzema, Sebastian Höfner, Stubbe F. Hviid, Wing-Huen Ip, Laurent Jorda,
Jörg Knollenberg, Gabor Kovacs, Rainer Kramm, Ekkehard Kührt, Michael Küppers, Fiorangela La Forgia, Luisa M. Lara,
Monica Lazzarin, Vicky Lee, Cédric Leyrat, Zhong-Yi Lin, José J. Lopez Moreno, Stephen Lowry, Sara Magrin,
Lucie Maquet, Simone Marchi, Francesco Marzari, Matteo Massironi, Harald Michalik, Richard Moissl,
Stefano Mottola, Giampiero Naletto, Nilda Oklay, Maurizio Pajola, Frank Preusker, Frank Scholten,
Nicolas Thomas, Imre Toth & Cecilia Tubiana


Pits have been observed on many cometary nuclei mapped by
spacecraft'*. It has been argued that cometary pits are a signature
of endogenic activity, rather than impact craters such as those on
planetary and asteroid surfaces. Impact experiments”* and models”*
cannot reproduce the shapes of most of the observed cometary pits,
and the predicted collision rates imply that few of the pits are related
to impacts*”. Alternative mechanisms like explosive activity’? have
been suggested, but the driving process remains unknown. Here we
report that pits on comet 67P/Churyumov-Gerasimenko are active,
and probably created by a sinkhole process, possibly accompanied
by outbursts. We argue that after formation, pits expand slowly
in diameter, owing to sublimation-driven retreat of the walls.
Therefore, pits characterize how eroded the surface is: a fresh come-
tary surface will have a ragged structure with many pits, while an
evolved surface will look smoother. The size and spatial distribution
of pits imply that large heterogeneities exist in the physical, struc-
tural or compositional properties of the first few hundred metres
below the current nucleus surface.

Understanding the differences in local activity of comet nuclei helps
us to constrain how their surfaces have evolved since their formation.
From July to December 2014, the OSIRIS (Optical, Spectroscopic, and
Infrared Remote Imaging System) cameras on board Rosetta’! continu-
ously monitored the activity of comet 67P/Churyumov-Gerasimenko
(referred to, hereafter, as comet 67P) from about a 30 km distance from
the surface of the nucleus and resolved the fine structure of dust jets’.
By means of stereo reconstruction, we found that broad jets can be
separated into narrower features, which are linked unambiguously to
quasi-circular depressions and to walls of alcoves that are a few tens to a 
few hundreds of metres in diameter. These pits are remarkably sym- 
metric and similar in size, and show interesting morphological details 
such as horizontal layers and terraces, vertical striations, and a smooth 
floor seemingly covered with dust. Some of these pits are as deep as a few 
hundred metres and provide a glimpse well below the nucleus surface. 
We detected a set of 18 quasi-circular pits on the northern hemisphere 
of comet 67P (Extended Data Table 1, Fig. 1). We observed that pits 
tend to cluster in small groups, and that several pits are active (Fig. 2). 
We measured the depth-to-diameter ratio (d/D) of the pits and found 
that active pits have a high d/D = 0.73 ± 0.08, while pits that are cur-
rently inactive are much shallower with mean d/D = 0.26 ± 0.08
(Extended Data Table 1, Fig. 3). The d/D ratio of these active pits is 
much higher than that of circular depressions on other comets: 
d/D=0.1 on comet 9P/Tempel 1 (ref. 4), and d/D=0.2 on comet 
81P/Wild 2 (refs 13 and 14). 

The difference in pit morphology on the three comets may reflect 
their different histories. For Jupiter family comets, the time since the 
last encounter with Jupiter is a proxy for the thermal history of the 
surface. Comet 9P is considered to be more processed by sublimation 
than comet 81P (ref. 3). In that view, comet 67P is relatively unprocessed
by sublimation because its perihelion was brought from 2.7
astronomical units (AU) to 1.2 AU by a close encounter with Jupiter
in 1959 (see Methods subsection ‘Orbit integration’). Comet 81P is also
considered a young comet, but its pitted terrains are exposed to the Sun 
at perihelion and so have experienced much stronger erosion than the 
pitted areas on comet 67P even though it has spent less time in the
inner Solar System. Deep active pits on comets are seemingly found
preferentially on surfaces that have not been notably eroded.

Figure 1 | Location of the pits considered in this study. A non-exhaustive catalogue of depressions sharing similar morphologies to those unambiguously linked
to jets in the Seth and Ma’at regions.

The terrain morphology inside the pits on comet 67P is not uniform 
and is classified as: very smooth texture; fractured terrain, terraces and 
alcoves; or globular texture. The globular texture is detected only in the 
deep pits and at a few additional locations on the nucleus, where deeper 
near-surface layers can be observed. This morphology extends to a 
depth of at least 200 m below the current nucleus surface (see, for
example, pit Seth_01, Extended Data Fig. 2). 


Jets arise from the edges of active pits, primarily from heavily 
fractured and globular morphologies (Fig. 2). However, the d/D 
ratio cannot be explained by current sublimation-driven retreat 
of the walls. Excavating a pit like Seth_01 by sublimating ice on 
the wall and floor would take more than 7,000 years (Methods). 
The cylindrical shape of most pits also provides evidence against 
formation by erosion, because this would result in elongated 
shapes and a latitudinal dependence of the pit distribution on 
the surface. 


Figure 2 | Jet-like features in the Seth region. a–d, Views of the main active
pit in the Seth region, at different angles of solar illumination. The illuminated
area of the pit is the south wall (a), the north wall (b), the east wall (c) and the
southeast wall (d). Blue arrows point to detected jets; red arrows indicate areas
where no activity could be observed, either from the walls or from their
surroundings. The left images are the original data; the right images are linearly
stretched in brightness to display the lowest 5% of the intensity values.
Figure 3 | Depth-to-diameter ratio as a function of pit diameter. Filled
symbols describe active pits; empty symbols describe currently inactive pits. 
Filled circles are active pits in the Seth region; filled squares are active pits in the 
Ma’at region. The lower value of d/D for pits in the latter might indicate a 
different formation mechanism. Error bars represent the uncertainties inherent 
to the shape reconstruction technique (stereo-photogrammetry) used to 
produce the digital terrain model of the comet’’. 


The 380 pits observed on comet 9P have been associated with
explosive activity (ref. 9). In the weeks before its encounter with this comet,
the spacecraft Deep Impact observed at least 10 outbursts, the largest of
which ejected an estimated (6–30) × 10⁴ kg of material (refs 10, 15 and 16). The
observations suggest that these outbursts originated from a series of
pits located in a belt around the nucleus. At 4.11 AU from the Sun, on
30 April 2014, OSIRIS observed an outburst on comet 67P (ref. 17).
Depending on the assumed size distribution of the ejected dust, the
resulting plume contained 10³–10⁵ kg of material, and was thus of
similar magnitude to the outbursts observed on comet 9P. Such outbursts
are too small to create the observed pits by explosive excavation.
Assuming a constant density (ref. 12) of 470 kg m⁻³, a typical large active pit
on comet 67P would have contained approximately 10⁹ kg of material,
10⁴ times more than the upper limit on the mass of the material
excavated by the observed outburst.
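
The order-of-magnitude comparison above can be checked with a short sketch. It assumes that a pit is approximated by a cylinder, uses the Seth_01 dimensions given in Methods (220 m diameter, 185 m depth), and reads the upper limit on the outburst mass as 10⁵ kg; the function and constant names are illustrative only.

```python
import math

# Values quoted in the text; the cylindrical pit shape is an assumption.
DENSITY = 470.0            # kg m^-3, constant bulk density assumed in the text
PIT_DIAMETER = 220.0       # m, Seth_01 (Methods)
PIT_DEPTH = 185.0          # m, Seth_01 (Methods)
OUTBURST_MASS_MAX = 1e5    # kg, upper limit read for the 30 April 2014 outburst


def cylinder_volume(diameter, depth):
    """Volume of a right circular cylinder in cubic metres."""
    return math.pi * (diameter / 2.0) ** 2 * depth


pit_mass = DENSITY * cylinder_volume(PIT_DIAMETER, PIT_DEPTH)
ratio = pit_mass / OUTBURST_MASS_MAX

print(f"pit mass: {pit_mass:.1e} kg")
print(f"pit mass / outburst upper limit: {ratio:.1e}")
```

Under these assumptions the pit mass comes out at a few times 10⁹ kg, several times 10⁴ larger than the outburst upper limit, which is consistent with the rounded figures in the text.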

We propose that the pits are formed via sinkhole collapse, when the 
ceiling of a subsurface cavity becomes too thin to support itself (Fig. 4 
and Methods). Because the size of sinkholes depends on the material 
strength of the top layers, sinkholes in a given terrain are all of similar 
size. They are characterized by circular depressions aligned with the
local gravity vector (ref. 18).

On cometary nuclei, the removal of subsurface volatiles may gen- 
erate a void. Failure of the cavity’s ceiling propagates upward. From the 
observed pit diameters and depths, and by treating the cavity’s roof as 
an unsupported beam failing under its own weight, we estimate that
the collapsing layer has an average tensile strength of 50 Pa (Extended
Data Fig. 8 and Methods). This value is similar to the lower-limit
estimate based on overhangs on the surface (ref. 19). The collapse exposes
fresh material in the walls of the pit, which sublimates to produce the
observed jets. Such collapse may very well be the driver of the 30 April
2014 outburst from comet 67P and the mini-outbursts from comet 9P.
The morphology and expansion of the dust plume of the 30 April 2014
outburst from comet 67P suggest that most of the activity arose from
an area within 30° of latitude of the north pole (ref. 17), compatible with the
location of the pits in the Seth region.

The collapse itself is a sudden event, but the cavity 100-200 m below 
the surface could have been growing over a much longer timescale. We 
explore three cavity formation scenarios: (1) primordial voids inherited
from formation; (2) direct sublimation of super volatiles (CO and
CO₂) as an evolutionary process; and (3) deep subsurface sublimation
triggered by a secondary source of energy.

(1) The primordial scenario implies that voids existed in the nucleus 
since its formation. This is possible if the comet formed by slow accretion
of cometesimals of tens to hundreds of metres in size. Low collision
speeds would prevent crushing the cavities (ref. 20). A weakening of the
surface due to direct sublimation would trigger roof collapse.

(2) Cavity formation can also be an evolutionary process. Because
comet nuclei have very low thermal conductivity (ref. 21), direct sublimation
of hexagonal water ice at the required depths would occur at an extremely
low rate and can therefore be ruled out. It is possible, however, to
sublimate more volatile ices like CO and CO₂ at lower temperatures.
The fact that we do not see pits everywhere suggests that these super
volatiles may not be distributed evenly inside the nucleus; such heterogeneity
has been observed on the surface of other comets (9P, ref. 22;
103P, ref. 23).

(3) A subsurface energy source may provide the heat necessary to 
sublimate a large cavity. A candidate is the phase transition in water ice 
from an amorphous to a crystalline structure. Crystallization has been 
used to explain many cometary activity features, and has been suggested
as the underlying process for the distant outbursts of comet
1P/Halley and the chaotic behaviour of comet 29P/Schwassmann-Wachmann (ref. 24),
or the outburst of comet 17P/Holmes (ref. 25). Different models
have placed the crystallization front at depths ranging from a few
metres to hundreds of metres (refs 26 and 27). We find that a subsurface cavity of
the size of the observed pits would require the phase transition of at least 
600 kg of amorphous ice, corresponding to a sphere of 2 m in diameter 
at most (see Methods). The detailed calculation of the amount needed is 
beyond the scope of this Letter. 

Ultimately, regardless of the process creating the initial subsurface 
cavity, active pits indicate that large structural and/or compositional 
heterogeneities exist within the first few hundred metres below the 
current nucleus surface of comet 67P. Clusters of active pits and col- 
lapsed structures are signatures of former cavities underneath, and 
reflect the thermal history of the nucleus. 


Figure 4 | Pit formation mechanism by sinkhole collapse. A typical comet 
surface with a layer of dust covering a mixture of dust and volatile material. A 
subsurface heat source sublimates surrounding ices. This gas then escapes or 
relocates, thus forming a cavity. When the ceiling becomes too thin to support 


its own weight it collapses, creating a deep, circular pit with a smooth bottom. 
Newly exposed material in the pit’s walls can start to sublimate. Blue arrows and 
white lines describe the escape of volatiles and fracturing of the surrounding 
material; red arrow shows the collapse of the cavity ceiling. 
Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 15 February; accepted 6 May 2015. 


1. Keller, H. U., Kramm, R. & Thomas, N. Surface features on the nucleus of comet Halley. Nature 331, 227-231 (1988).
2. Soderblom, L. A. et al. Observations of comet 19P/Borrelly by the miniature integrated camera and spectrometer aboard Deep Space 1. Science 296, 1087-1091 (2002).
3. Brownlee, D. E. et al. Surface of young Jupiter family comet 81P/Wild 2: view from the Stardust spacecraft. Science 304, 1764-1769 (2004).
4. Thomas, P. C. et al. The nucleus of comet 9P/Tempel 1: shape and geology from two flybys. Icarus 222, 453-466 (2013).
5. A’Hearn, M. F. et al. Deep Impact: excavating comet Tempel 1. Science 310, 258-264 (2005).
6. Schultz, P., Hermalyn, B. & Veverka, J. The Deep Impact crater on 9P/Tempel-1 from Stardust-NExT. Icarus 222, 502-515 (2013).
7. Holsapple, K. A. & Housen, K. R. A crater and its ejecta: an interpretation of Deep Impact. Icarus 187, 345-356 (2007).
8. Vincent, J.-B., Oklay, N., Marchi, S., Höfner, S. & Sierks, H. Craters on comets. Planet. Space Sci. 107, 53-63 (2015).
9. Belton, M. J. S. et al. The origin of pits on 9P/Tempel 1 and the geologic signature of outbursts in Stardust-NExT images. Icarus 222, 477-486 (2013).
10. Belton, M. J. S. et al. Cometary cryo-volcanism: source regions and a model for the UT 2005 June 14 and other mini-outbursts on comet 9P/Tempel 1. Icarus 198, 189-207 (2008).
11. Keller, H. U. et al. OSIRIS – The scientific camera system onboard Rosetta. Space Sci. Rev. 128, 433-506 (2007).
12. Sierks, H. et al. On the nucleus structure and activity of comet 67P/Churyumov-Gerasimenko. Science 347, aaa1044 (2015).
13. Kirk, R. et al. Topography of the 81/P Wild 2 nucleus derived from Stardust stereoimages. Lunar Planet. Sci. XXXVI, 2244 (2005).
14. Basilevsky, A. T. & Keller, H. U. Comet nuclei: morphology and implied processes of surface modification. Planet. Space Sci. 54, 808-829 (2006).
15. Farnham, T. L. et al. Dust coma morphology in the Deep Impact images of comet 9P/Tempel 1. Icarus 187, 26-40 (2007).
16. Feldman, P. D. et al. Hubble Space Telescope observations of comet 9P/Tempel 1 during the Deep Impact encounter. Icarus 187, 113-122 (2007).
17. Tubiana, C. et al. 67P/Churyumov-Gerasimenko: activity between March and June 2014 as observed from Rosetta/OSIRIS. Astron. Astrophys. 573, A62 (2015).
18. Waltham, T., Bell, F. G. & Culshaw, M. G. Sinkholes and Subsidence (Springer, 2007).
19. Thomas, N. et al. The morphological diversity of comet 67P/Churyumov-Gerasimenko. Science 347, aaa0440 (2015).
20. Weissman, P. R., Asphaug, E. & Lowry, S. C. in Comets II (eds Festou, M. C. et al.) 337-357 (Univ. Arizona Press, 2004).


66 | NATURE | VOL 523 | 2 JULY 2015 


21. Groussin, O. et al. The temperature, thermal inertia, roughness and color of the nuclei of comets 103P/Hartley 2 and 9P/Tempel 1. Icarus 222, 580-594 (2013).
22. Feaga, L. M., A’Hearn, M. F., Sunshine, J. M., Groussin, O. & Farnham, T. L. Asymmetries in the distribution of H2O and CO2 in the inner coma of comet 9P/Tempel 1 as observed by Deep Impact. Icarus 190, 345-356 (2007).
23. A’Hearn, M. F. et al. EPOXI at comet Hartley 2. Science 332, 1396-1400 (2011).
24. Prialnik, D., Benkhoff, J. & Podolak, M. in Comets II (eds Festou, M. C. et al.) 359-387 (Univ. Arizona Press, 2004).
25. Hillman, Y. & Prialnik, D. A quasi 3-D model of an outburst pattern that explains the behavior of comet 17P/Holmes. Icarus 221, 147-159 (2012).
26. Tancredi, G., Rickman, H. & Greenberg, J. M. Thermochemistry of cometary nuclei. I. The Jupiter family case. Astron. Astrophys. 286, 659-682 (1994).
27. Marboeuf, U. & Schmitt, B. How to link the relative abundances of gas species in coma of comets to their initial chemical composition? Icarus 242, 225-248 (2014).


Acknowledgements OSIRIS was built by a consortium of the Max-Planck-Institut für
Sonnensystemforschung, Katlenburg-Lindau, Germany, the CISAS, University of
Padova, Italy, the Laboratoire d’Astrophysique de Marseille, France, the Instituto de
Astrofisica de Andalucia, CSIC, Granada, Spain, the Research and Scientific Support
Department of the European Space Agency, Noordwijk, The Netherlands, the Instituto
Nacional de Técnica Aeroespacial, Madrid, Spain, the Universidad Politéchnica de
Madrid, Spain, the Department of Physics and Astronomy of Uppsala University,
Sweden, and the Institut für Datentechnik und Kommunikationsnetze der Technischen
Universität Braunschweig, Germany. The support of the national funding agencies of
Germany (DLR), France (CNES), Italy (ASI), Spain (MEC), Sweden (SNSB), and the ESA
Technical Directorate is acknowledged. This work was also supported by NASA JPL
contract 1267923 to the University of Maryland (M.F.A’H. and D.B.). M.F.A’H. is a Gauss
Professor at the Akademie der Wissenschaften zu Göttingen and Max-Planck-Institut
für Sonnensystemforschung (Germany). This research has made use of NASA’s
Astrophysics Data System Bibliographic Services. We thank H. J. Melosh for reviews
and criticism.


Author Contributions J.-B.V. led the study, identified the pits and measured their global
parameters. D.B. analysed outbursts and phase change transitions and prepared the
sinkhole model. S.B. performed the detailed morphology analysis. H.S., C.B., P.L., R.R.,
D.K. and H.R. are the lead scientists of the OSIRIS project. The other authors are
co-investigators who built and ran this instrument and made the observations possible,
and associates and assistants who participated in the study.


Author Information All data presented in this paper will be delivered to the ESA’s 
Planetary Science Archive and NASA's Planetary Data System in accordance with the 
schedule established by the Rosetta project and will be available on request before that 
archiving. Reprints and permissions information is available at www.nature.com/reprints.
The authors declare no competing financial interests. Readers are welcome to
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to J.-B.V. (vincent@mps.mpg.de). 


©2015 Macmillan Publishers Limited. All rights reserved 


METHODS 


Detection of activity. Cometary activity is typically defined as the ensemble of 
physical processes forming the gas and dust coma that escapes from the nucleus. 
The main driver of activity is the solar insolation, which triggers the sublimation of 
volatiles trapped in the subsurface of the nucleus”*. The liberated gas expands into 
vacuum and drags along refractory grains from the surface. It has been known 
since the first in situ mission to a comet that this activity is not uniformly dis- 
tributed over the nucleus although the reasons for this anisotropy are not well 
understood’. 

From the uneven distribution of active sources on the surface, anisotropies in 
the coma arise in the form of narrow dusty streams (hereafter called ‘jets’), which 
expand straight from the nucleus for at least some distance”. Neither their source 
nor the physics of their formation have been fully explained yet, although many 
authors have proposed some explanations such as patches of enhanced H2O ice 
content, localized super-volatile release from steep-sided pits, or repetitive mini- 
outbursts*®. It is not clear whether these features are linked to volatiles at their 
footprint or if they trace the shock front between competing gas flows from nearby 
areas*'. 

In OSIRIS images, jets appear as fuzzy streams of bright material arising from 
specific areas on the nucleus surface. They are typically detected against the coma 
or a dark background, which can be either empty space or cast shadows. They are 
seen at all spatial scales, from large features spanning several tens of kilometres, 
down to the limit of spatial resolution. The smallest features detected so far are a 
few pixels across, which translates into a couple of metres at most. Their typical 
surface brightness is 10% to 40% higher than the surrounding background space, 
that is, the general coma’’. By monitoring the activity and observing these jets from 
different angles we can perform stereo imaging, reconstruct their three-dimen- 
sional structure and trace them back precisely to morphological features on the 
surface. 

Orbit integration. Observations and orbit reconstructions have shown that 
comet 67P had a close encounter with Jupiter that brought its perihelion from 
2.7 AU to 1.2 AU, in 1959 (JPL Horizons ephemerides, http://ssd.jpl.nasa.gov/hor- 
izons.cgi). We reconstructed its orbit before that time, on the basis of a well- 
established integration model”. For the initial conditions and their errors, we refer 
to the database of IMCCE (http://www.imcce.fr/langues/en/ephemerides/). We 
compute 200 clone orbits with small random Gaussian variations of the initial
conditions, drawn according to their Gaussian errors. From these 200 clone orbits, we
deduce the mean perihelion distance and its standard deviation (σ). We find that
the orbits in the interval [(mean − σ), (mean + σ)] together with the orbits beyond
(mean + σ), that is, 84% of the clones, have a perihelion distance of at least 2 AU, with a mean value
always greater than 3 AU (Extended Data Fig. 1).
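
The clone-sampling step can be sketched as follows. This is only an illustration of drawing Gaussian clones and summarizing the perihelion distance q = a(1 − e); it does not integrate the orbits backward in time, and the nominal elements and uncertainties below are placeholders chosen for the example, not the IMCCE values used in the study.

```python
import random
import statistics

# Illustrative placeholders, NOT the IMCCE initial conditions or errors.
NOMINAL = {"a_au": 3.46, "e": 0.64}   # semi-major axis (AU), eccentricity
SIGMA = {"a_au": 0.01, "e": 0.001}    # assumed 1-sigma Gaussian errors


def make_clones(nominal, sigma, n_clones=200, seed=0):
    """Draw clone initial conditions with independent Gaussian perturbations."""
    rng = random.Random(seed)
    return [
        {key: rng.gauss(nominal[key], sigma[key]) for key in nominal}
        for _ in range(n_clones)
    ]


def perihelion_distance(clone):
    """Perihelion distance q = a (1 - e), in AU."""
    return clone["a_au"] * (1.0 - clone["e"])


def summarize(values):
    """Mean, sample standard deviation, and fraction at or above (mean - sigma)."""
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    kept = sum(1 for v in values if v >= mean - sigma)
    return mean, sigma, kept / len(values)


clones = make_clones(NOMINAL, SIGMA)
q_values = [perihelion_distance(c) for c in clones]
mean_q, sigma_q, fraction_kept = summarize(q_values)
print(f"mean q = {mean_q:.3f} AU, sigma = {sigma_q:.3f} AU, "
      f"fraction with q >= mean - sigma = {fraction_kept:.2f}")
```

For a Gaussian distribution, the orbits within one standard deviation of the mean plus those above it make up roughly 84% of the sample, which is where that percentage comes from. In an actual reconstruction each clone would be propagated with the integration model before q is evaluated at each epoch.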

Morphology, variegation and activity of the pits. The pit morphologies are 
presented in more detail in Extended Data Figs 3, 4. The complete list of 
OSIRIS images used for this study is given in Extended Data Table 2. 

The activity identified in Seth_01 covers the portion of the pit presented in 
Fig. 1, which displays different morphologies and textures. Therefore, it is not clear 
at this point that a specific texture and morphology is linked to the active pits. The 
detailed observations of the pits Ma’at_01 and Ma’at_02 seem to indicate that 
heavily fractured terrains are, however, favourably associated with activity. 
Extended Data Fig. 3 highlights the multiple joints that are also associated with 
the globular texture for Ma’at_01. Thus, fractured texture might be favourable for 
these active pits, probably because it allows the heat to propagate deeper into the 
interior and sublimate the ices. One other possible location for the activity inside 
the pits could be the terraces seen in Seth_01 (and maybe in Ma’at_02, although 
they are less developed). The two terraces highlighted in Extended Data Fig. 3 
cover around 50% of the circumference of the Seth_01 pit, and they match the 50% 
where activity has been identified so far. Therefore, the terraces could be the source 
of the activity if they expose some kind of fresh ice (or gas/ice from the coma falling 
back and depositing on this flat surface). 

The contact between the edges of the pits and the surroundings is different 
between the active pits Seth_01 and Ma’at_01. This could be the result of different 
mechanisms that formed them or the primordial morphology of the region. The 
bottoms of most pits are covered with a fine dusty material and boulders, which 
could be an indication of the relative age of these pits. Seth_01’s floor appears very 
flat (Extended Data Figs 3, 4), with a very smooth structure that does not contain 
any boulders. The floors of Seth_02 and Seth_03, pits where activity has been 
identified, share the same textural characteristics as Seth_01. The viewing condi- 
tions are less favourable for Ma’at_01; however, Extended Data Figs 3-5 show few 
boulders, all of small size. The same figures highlight that Ma’at_02 has a much 
higher number of boulders with larger sizes. These boulders may be an indirect 
way of estimating the relative age of the pits, because boulders accumulate with 
time. Thus, boulder-free floors represent the youngest pits. The relative age dating 
of these pits could also be speculated from the Ma’at_01 to Ma’at_03 series of pits. 
With Ma’at_01 being the youngest and Ma’at_03 the oldest, one can see the
degradation of the wall of the pits and the accumulation of material within 
the pit. The accumulation of boulders is rather limited in Ma’at_03, although 
the degradation of the rim is in a more advanced stage when compared to the 
other two, which confirms that it is the oldest. This low accumulation could be due 
to the geometry of Ma’at_03 or related to the original depth of the pit, which is 
most likely to have been smaller. The boulder-size distributions in the Seth and 
Ma’at pits are shown in Extended Data Figs 5, 6.

We used additional images obtained through filters near the visible spectrum 

(blue: 480 nm, orange: 649 nm, infrared: 989 nm) to see if, in addition to the 
peculiar morphology, pits present a different colour to the rest of the surface. By 
using filter ratios to limit the effect of topography and illumination conditions, we 
found that the floor and walls of the pits exhibit the same less-red spectral slope as
the active Hapi region (Extended Data Fig. 7). If we denote the reflectance by R,
then we measure a ratio R_infrared/R_blue = 1.8 in the active area (pits) and
R_infrared/R_blue = 2.1 elsewhere on the nucleus. A full understanding of the implications of
the compositional differences within the nucleus will require a dedicated invest- 
igation, but the difference in spectral slope observed in Extended Data Fig. 7 
already indicates that spectral variation is an intrinsic property of currently active 
regions on comet 67P. 
Pit growth. A major question is whether the d/D ratio of the pits can be explained 
by the current sublimation-driven retreat of the walls. We see jets arising from the 
edges of active pits (Fig. 1), indicating that erosion currently does occur. We first 
consider slowly excavating a pit by sublimating subsurface ice on the walls and 
floor and growing the depressions in both diameter and depth. We take as an 
example the most active pit (Seth_01). With a diameter of 220 m and a depth
of 185 m, it has a volume of 7 × 10⁶ m³, which corresponds to 3.3 × 10⁹ kg of
material if we assume a constant density of 470 kg m⁻³. Current models of activity
for comet 67P (refs 33–35) describe a global dust production rate of 9.3 kg s⁻¹
at 3.5 AU, which translates into only 15 g s⁻¹ of dust emitted from a single pit.
Additionally, the varying latitudes and seasons limit the pits’ illumination to only a 
few hours per comet day for the walls. In some cases, the pit floor is only barely 
illuminated, if at all. Considering that most currently observed pits will be in polar 
night at perihelion and will not experience many changes in dust production rate, 
it would take more than 7,000 years to dig out one pit. 
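
The timescale can be reproduced with a back-of-the-envelope sketch, assuming the pit mass (3.3 × 10⁹ kg) and per-pit dust rate (15 g s⁻¹) quoted above. The `duty_cycle` parameter is a hypothetical knob introduced here to represent the limited illumination; the text gives no numerical value for it.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

PIT_MASS = 3.3e9        # kg, Seth_01 mass quoted in the text
PIT_DUST_RATE = 0.015   # kg s^-1, i.e. 15 g s^-1 from a single pit


def excavation_time_years(mass_kg, rate_kg_per_s, duty_cycle=1.0):
    """Years needed to remove mass_kg at a constant rate.

    duty_cycle (hypothetical) is the fraction of time the pit is active,
    e.g. because its walls are only illuminated a few hours per comet day.
    """
    return mass_kg / (rate_kg_per_s * duty_cycle) / SECONDS_PER_YEAR


continuous = excavation_time_years(PIT_MASS, PIT_DUST_RATE)
half_time = excavation_time_years(PIT_MASS, PIT_DUST_RATE, duty_cycle=0.5)
print(f"continuous emission: {continuous:.0f} years")
print(f"active half of the time: {half_time:.0f} years")
```

Even with uninterrupted emission the estimate is close to 7,000 years, so any reduction in illumination pushes it above that figure.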

Erosion is a second-order process that will slowly modify the pits after they are 

formed. This is supported by our observations; several active pits display alcoves 
within their walls, which we interpret as signatures of continued growth as a result 
of erosion by sublimation, block falls and wall retreats long after the pit formation, 
because these alcoves are always facing the direction of most insolation received 
per comet rotation. 
Phase transition. Crystallization has been invoked to explain many cometary activ- 
ity features, and has been suggested as the underlying process for the distant out- 
bursts of comet 1P/Halley*’, the chaotic behaviour of comet 29P/Schwassmann- 
Wachmann” and the violent outburst of comet 17P/Holmes”. 

From the ratio between the latent heat of the amorphous-to-crystalline transition
(9 × 10⁴ J kg⁻¹, refs 36, 37) and of the sublimation of hexagonal ice (0.334 J kg⁻¹),
the phase transition of 1 kg of amorphous ice to crystalline ice provides enough
energy to sublimate 270 kg of hexagonal ice, provided that crystallization occurs
on a timescale short enough for the phase transition to effectively heat the surrounding
ice. Using typical low thermal inertia, Marboeuf and Schmitt (ref. 27) find that crystallization
proceeds to a depth of only approximately 1 m. Other studies estimate that
the crystallization front should extend to depths of between about 5 m and about 80 
m (ref. 37), or much greater depths”. 

Given the chaotic orbital evolution of comet 67P, we estimate that a 100 m deep
layer could have recently reached the appropriate characteristics (100-120 K leading
to a phase transition on a timescale of months to a year) only if the local
thermal inertia is high (250 J m⁻² K⁻¹ s⁻¹ᐟ² and above), a value more than five
times what has been measured on comet 67P. For lower values of the thermal
inertia, the phase transition can occur at a depth of 100-200 m only after a long
period of time in the inner Solar System. A cavity could have formed much earlier
in the history of the comet, even if the final collapse that produced the observed
sinkhole occurred only recently. A subsurface cavity the size of the pits we observe
would require the phase transition of approximately 600 kg of amorphous ice to
crystalline ice. If we assume a density of 470 kg m⁻³ and a porosity of 70-80%, we
obtain a 20-40% ice mass fraction in the nucleus (ice density is 920 kg m⁻³,
solid material is half silicate and half organics, with respective densities of
3,500 kg m⁻³ and 2,200 kg m⁻³). Therefore, 600 kg of ice would be embedded
in 1,500-3,000 kg of cometary material and would occupy a volume of 3-6 m³, that
is, a sphere of at most 2 m in diameter. Upon experiencing its phase transition, this
pocket of amorphous ice would release enough heat to sublimate the surrounding
crystalline ice in a volume equivalent to the observed pits.
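
The last step of this estimate (from 600 kg of ice to an embedding mass, volume and equivalent sphere) can be sketched as below, assuming the 20-40% ice mass fraction and the 470 kg m⁻³ bulk density quoted above. The function name is illustrative.

```python
import math

ICE_MASS = 600.0                   # kg of amorphous ice
BULK_DENSITY = 470.0               # kg m^-3, bulk density assumed in the text
ICE_MASS_FRACTIONS = (0.20, 0.40)  # range of ice mass fraction from the text


def ice_pocket(ice_mass, ice_fraction, bulk_density):
    """Total mass (kg), volume (m^3) and equivalent sphere diameter (m)."""
    total_mass = ice_mass / ice_fraction
    volume = total_mass / bulk_density
    diameter = (6.0 * volume / math.pi) ** (1.0 / 3.0)
    return total_mass, volume, diameter


results = [ice_pocket(ICE_MASS, f, BULK_DENSITY) for f in ICE_MASS_FRACTIONS]
for fraction, (mass, volume, diameter) in zip(ICE_MASS_FRACTIONS, results):
    print(f"ice fraction {fraction:.0%}: {mass:.0f} kg, "
          f"{volume:.1f} m^3, sphere diameter {diameter:.1f} m")
```

With these inputs the embedding mass spans 1,500-3,000 kg and the volume roughly 3-6 m³, giving an equivalent sphere about 2 m across, in line with the rounded values in the text.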
Sinkhole model. A first-order estimate of the stability of a cavity ceiling may be
derived by treating the ceiling as a beam failing under its own weight’***. Failure of
this beam occurs when the bending moment exceeds the material’s tensile
strength. Assuming the comet’s material is highly porous, the stable beam depth
d = 6D²ρa/(8S), where D is the cavity’s diameter, ρ is the density of the material in
the ceiling (assumed to be 470 kg m⁻³), a is the gravitational acceleration on the
comet (ref. 12) (5 × 10⁻⁴ m s⁻²) and S is the tensile strength of the ceiling material. For
the tensile strength, we adopted an initial range between the lower limit of 10 Pa 
derived from overhangs on the surface’? and the upper limit of 10 kPa derived 
from the Deep Impact experiment*’. We further assume that the cavity is of 
approximately the same size as the resulting pit and that the depth of the pit is 
comparable to the depth of the original ceiling. 
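
As a worked check of the beam relation, read here as d = 6D²ρa/(8S), the sketch below inverts it for the tensile strength using the Seth_01 dimensions (D = 220 m, d = 185 m) and the density and gravity quoted above. Function names are illustrative, and the result depends on the stated assumption that the pit dimensions match those of the original cavity and ceiling.

```python
DENSITY = 470.0   # kg m^-3, ceiling material density assumed in the text
GRAVITY = 5e-4    # m s^-2, gravitational acceleration quoted in the text


def critical_thickness(cavity_diameter, tensile_strength,
                       density=DENSITY, gravity=GRAVITY):
    """Stable ceiling thickness d = 6 D^2 rho a / (8 S), in metres."""
    return 6.0 * cavity_diameter ** 2 * density * gravity / (8.0 * tensile_strength)


def implied_tensile_strength(cavity_diameter, thickness,
                             density=DENSITY, gravity=GRAVITY):
    """Invert the same relation for S (Pa), given the ceiling thickness."""
    return 6.0 * cavity_diameter ** 2 * density * gravity / (8.0 * thickness)


strength_seth_01 = implied_tensile_strength(220.0, 185.0)
print(f"implied tensile strength for Seth_01: {strength_seth_01:.0f} Pa")

# Critical thickness for a 220 m cavity across the adopted strength range.
for strength in (10.0, 50.0, 1e4):
    print(f"S = {strength:g} Pa -> d = {critical_thickness(220.0, strength):.1f} m")
```

The implied strength is a little under 50 Pa, consistent with the average value quoted for the collapsed layer.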

Code availability. The code used to generate the orbital evolution of comet 67P is 
a direct implementation of a published model”. 
Extended Data Figure 1 | Perihelion distance of comet 67P as a function of
time. Solid line, mean value of the orbits integrated according to a Monte Carlo
method. Dashed lines, standard deviation of the mean value. a, Perihelion
distance over the last 270 years, when comet 67P experienced several close
encounters with Jupiter. b, The long term integration over the full dynamical
lifetime of the comet (10,000 years).
Extended Data Figure 2 | Multiple views of the Seth_01 pit observed by the
OSIRIS camera. a, Southern part of the pit wall; b, western part of the pit wall;
c, d, eastern part of the pit wall with different illumination conditions; and
e, southeastern part of the pit wall observed in the shadow. In all the images, the
green arrow points to the same boulder and the blue arrow to the same ridge
inside the pit. The orange arrows point to terraces within the pit. The Seth_01
pit is 220 m in diameter.
Extended Data Figure 3 | Multiple views of the Ma’at_01, Ma’at_02 and
Ma’at_03 pits observed by the OSIRIS camera. a, b, Side views of the pits with
different illumination conditions; c, opposite viewing conditions highlighting
the other side in the shadow; and d, e, detailed views of Ma’at_02 (d) and
Ma’at_01 (e) from light reflection in the shadow. Note the clear cross-cutting
fractures on the wall in e. In c, the white line is an artefact due to stretching of
the image to highlight the shadowed part. The Ma’at_02 pit is 130 m in
diameter. The blue, green and orange arrows point to the same features in each
image.
Extended Data Figure 4 | Additional views of the Seth_01 and Ma’at_01
pits. a, The floor of Seth_01 shows no accumulation of boulders; the same is
true for Seth_02 and Seth_03 (not shown). b, The floor of Ma’at_01 shows a few
boulders that have accumulated; note the activity located at the bottom. c, The
floor of Ma’at_02 shows an asymmetric accumulation of boulders that could
be the result of upper wall collapse.


Extended Data Figure 5 | Boulder counts in Ma’at_01 and Ma’at_02. We
counted boulders on the floor of Ma’at_01 and Ma’at_02. We used OSIRIS
narrow angle camera (NAC) images with a resolution of 1.2 metres per pixel,
acquired at 67 km from the comet nucleus centre. a, b, The illumination
conditions are such that almost 80% of the floor of Ma’at_01 (a) and 95% of the
floor of Ma’at_02 (b) are illuminated and the pits are facing the observer,
which ensures an unbiased boulder count. We identified 23 boulders inside
Ma’at_01 and 68 on the floor of Ma’at_02. The diameter of the boulders
(in metres) is indicated by the coloured circles; see inset. Despite the 1.2 metres
per pixel resolution, we were able to identify some boulders with a diameter
between 1.5 m and 2.5 m (9 in Ma’at_01 and 15 in Ma’at_02), owing to the
presence of elongated shadows. The maximum boulder diameter is 4.3 m
in Ma’at_01 and 9.0 m in Ma’at_02.
Extended Data Figure 6 | Cumulative boulder-size distribution for
Ma’at_01 and Ma’at_02. This distribution has a power index of —4.9*?5 for 
Ma’at_01 (left) and —4.2 Ba for Ma’at_02 (right), for boulder diameters 
greater than 3 m; the corresponding power laws are indicated by the solid
(fit) lines. Boulders smaller than 3 m in diameter are at the edge of our detection 
limit, meaning that the counts for these boulders are less reliable than the other 
counts; consequently, they were not included when fitting the power law. The
higher number of boulders in Ma’at_02 is consistent with the theory that 
boulders are debris that falls from the walls as the pit erodes, long after the initial 
formation of the pit. Error bars are defined as the square root of the cumulative 
number of boulders to reflect the increasing diameter uncertainty for small 
boulder sizes. 
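
A cumulative size distribution and its power index can be sketched as below. This is a minimal illustration using an unweighted least-squares fit in log-log space, which is not necessarily the fitting method used for the figure; the boulder diameters are hypothetical values made up for the example, not the measured catalogue.

```python
import math


def cumulative_counts(diameters, thresholds):
    """Number of boulders with diameter >= each threshold."""
    return [sum(1 for d in diameters if d >= t) for t in thresholds]


def fit_power_index(thresholds, counts):
    """Unweighted least-squares slope of log10(N) against log10(D)."""
    points = [(math.log10(t), math.log10(c))
              for t, c in zip(thresholds, counts) if c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def error_bars(counts):
    """Square root of the cumulative number, as in the caption."""
    return [math.sqrt(c) for c in counts]


# Hypothetical diameters in metres, for illustration only.
diameters = [1.6, 2.0, 2.4, 3.0, 3.1, 3.3, 3.6, 4.0, 4.3, 5.2, 6.0, 9.0]
thresholds = [3.0, 4.0, 5.0, 6.0]   # only diameters of 3 m and above are fitted

counts = cumulative_counts(diameters, thresholds)
index = fit_power_index(thresholds, counts)
print("cumulative counts:", counts)
print("error bars:", [round(e, 2) for e in error_bars(counts)])
print(f"power index: {index:.2f}")
```

Restricting the thresholds to 3 m and above mirrors the caption's exclusion of boulders near the detection limit.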
Extended Data Figure 7 | RGB view of the Seth pits and the Hapi region.
The red–blue–green components of this colour map represent colour ratios
between the reflectance signals measured at different wavelengths: red, 989
nm/649 nm; green, 480 nm/649 nm; blue, 649 nm. The colour map is overlaid
onto a grey image showing the comet surface. The Hapi region and part of
Seth appear with a blue hue, indicative of a bluer spectral slope than other regions
of the nucleus, which are typically red. The interiors of Seth_01, Seth_02 and
Seth_03 have the same blue hue that is characteristic of the active Hapi region.
Extended Data Figure 8 | Modelled critical ceiling thickness for increasing
cavity diameter and different tensile strengths. We predict the average tensile
strength of a collapsed layer using the dimensions of a pit (Methods). For
example, a pit of 220 m in diameter and 185 m in depth (such as Seth_01)
suggests that the collapsed layer had an average tensile strength of 50 Pa.
Extended Data Table 1 | List of pits considered in this paper
Diameter and depth measured on digital terrain model reconstructed from OSIRIS images by stereo-photogrammetry. Active pits have a mean d/D = 0.73 ± 0.08; inactive pits have a mean d/D = 0.26 ± 0.08.
Maximum error is 20 m for the diameter and 5 m for the depth. Coordinates are given in the ‘Cheops’ reference frame.
Extended Data Table 2 | List of images used

Image | Distance from nucleus centre (km) | Resolution on the surface (m/px)
NAC_2014-08-06T02.19.14.570Z_ID30_1397549900_F22 | |
WAC_2014-10-20T08.15.50.752Z_ID30_1397549000_F18 | |

Pits in Ma’at region | Distance from nucleus centre (km) | Resolution on the surface (m/px)
NAC_2014-08-21T20.42.54.581Z_ID30_1397549100_F22 | 67.04 |
… | 66.54 |
WAC_2014-10-20T04.52.10.460Z_ID10_1397549600_F18 | |

Also shown is the distance from the comet nucleus centre that the image was taken and the resolution of the image.
A colloidal quantum dot spectrometer


Jie Bao & Moungi G. Bawendi


Spectroscopy is carried out in almost every field of science, whenever
light interacts with matter. Although sophisticated instruments with
impressive performance characteristics are available, much effort
continues to be invested in the development of miniaturized, cheap
and easy-to-use systems. Current microspectrometer designs
mostly use interference filters and interferometric optics that
limit their photon efficiency, resolution and spectral range. Here
we show that many of these limitations can be overcome by replacing
interferometric optics with a two-dimensional absorptive filter array
composed of colloidal quantum dots. Instead of measuring different
bands of a spectrum individually after introducing temporal
or spatial separations with gratings or interference-based narrowband
filters, a colloidal quantum dot spectrometer measures a light
spectrum based on the wavelength multiplexing principle: multiple
spectral bands are encoded and detected simultaneously with one
filter and one detector, respectively, with the array format allowing
the process to be efficiently repeated many times using different
filters with different encoding so that sufficient information is
obtained to enable computational reconstruction of the target spectrum.
We illustrate the performance of such a quantum dot microspectrometer,
made from 195 different types of quantum dots with
absorption features that cover a spectral range of 300 nanometres, by
measuring shifts in spectral peak positions as small as one nanometre.
Given this performance, demonstrable avenues for further
improvement, the ease with which quantum dots can be processed
and integrated, and their numerous finely tuneable bandgaps that
cover a broad spectral range, we expect that quantum dot microspectrometers
will be useful in applications where minimizing size,
weight, cost and complexity of the spectrometer are critical.

Figure 1 illustrates different operating mechanisms that enable spec- 
tral analysis. Conventionally, spectral analysis is achieved using dis- 
persive optics such as gratings, which separate different light bands 
before their intensity is measured. An alternative approach uses a set of 
distinct, yet continuously tuneable, broadband absorptive filters and 
reconstructs the target spectrum from the set of measured broadband 
intensity distributions. In principle, this approach allows for a simpli- 
fied and cheaper spectrometer design that is well suited to miniatur- 
ization. But suitable broadband filters, with appropriate system 
compatibility and photo-stability, have so far not been readily avail- 
able. Here we show that colloidal quantum dots (CQDs), with their 
continuously tuneable, size-dependent bandgaps, offer a practical 
solution to the filter availability problem that has hindered the 
development of microspectrometers that use broadband absorptive
filters. 

CQDs are semiconductor nanocrystals with radii smaller than the 
bulk exciton Bohr radius, which leads to quantum confinement of 
electronic charges, with decreasing crystal size strengthening confine- 
ment and hence increasing the effective bandgap, and blue-shifting 
both optical absorption and fluorescent emission. Research over the 
past few decades has resulted in a vast library of CQDs. We can now 
tune absorption spectra continuously and finely over wavelengths 
ranging from deep ultraviolet to mid-infrared, simply by changing 
the CQD’s size, shape and composition. CQDs can also be printed
into very fine patterns. Their photo-stability is superior to that of
dyes. In addition, dyes achieve different absorption profiles by
altering their chemical compositions and structures, which leads to
them having different chemical properties. In contrast, CQDs can
provide different bandgaps but still share similar chemical properties, 
which simplifies synthesis and processing when a large number of 
materials are involved. CQDs are thus ideal broadband filter materials 
for use in wavelength multiplexing microspectrometers. 

Figure 2a illustrates how a basic quantum dot spectrometer oper- 
ates. The measurement of an incident-light spectrum by a quantum 
dot spectrometer is based on the measurement of the total transmitted 
intensity of the spectrum that passes through a given CQD filter. This 
intensity measurement is repeated for each CQD filter, resulting in a 
set of measured transmitted-light intensities. The original (incident) 
light spectrum is computationally reconstructed on the basis of this set 
of transmitted-light intensities. Spectral reconstruction is performed 
using linear regression, by finding a spectrum (the reconstructed spec- 
trum) that minimizes the sum of the squares of the differences between 
the set of measured transmitted-light intensities and a set of numer- 
ically computed intensities, which are the products of the recon- 
structed incident-light spectrum and the transmission spectra that 
are measured for each CQD filter, integrated with respect to wave- 
length. 

To illustrate the above method mathematically, consider an arbitrary
incident-light spectrum $\Phi(\lambda)$ (where $\lambda$ is the wavelength) that is
transmitted through a set of $n_F$ CQD filters, one at a time, resulting in
measured total transmitted-light intensities $I_i$ (where $i = 1, 2, \ldots, n_F$ is
the filter number). One data point is generated per filter:

$$I_i = \int_\lambda \Phi(\lambda)\,T_i(\lambda)\,\mathrm{d}\lambda, \qquad i = 1, 2, \ldots, n_F \qquad (1)$$

where $T_i(\lambda)$ is the transmission spectrum of CQD filter $i$. Because $T_i(\lambda)$
for each CQD filter is predetermined (by measuring it), the only
unknown is $\Phi(\lambda)$, which is evaluated at $n_\lambda$ discrete $\lambda$ values and so
corresponds to a set of $n_\lambda$ variables. Although increasing $n_\lambda$ for a given
spectral range may increase the spectral resolution, the requirement that
equation (1) be solvable limits $n_\lambda$ to at most $n_F$. In the ideal case,
$n_\lambda = n_F$, which produces a set of linear equations with a unique solution;
in practice, however, measurement errors yield inconsistencies
within the set of equations. Approximate solutions can nevertheless be
obtained using least-squares linear regression. In such a case, a given $n_F$
no longer provides an equal number of accurately reconstructed spectral
data points and so $n_\lambda < n_F$. The larger the error, the more filters are
required to accurately reconstruct each spectral data point.
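
As a minimal sketch of the least-squares reconstruction described above, the following Python snippet discretizes equation (1) as a matrix product and solves for the spectrum. The function name `reconstruct_spectrum`, the array sizes and the random stand-in transmission matrix are illustrative assumptions; they are not the measured CQD spectra or the code used to obtain the results reported here.

```python
import numpy as np


def reconstruct_spectrum(T, I):
    """Least-squares estimate of the spectrum from filter intensities.

    T : (n_F, n_lambda) array, transmission of each filter in each wavelength bin
    I : (n_F,) array, total transmitted intensity measured through each filter
    """
    phi, *_ = np.linalg.lstsq(T, I, rcond=None)
    return phi


# Synthetic stand-in data (hypothetical, not the measured CQD filter spectra).
rng = np.random.default_rng(0)
n_F, n_lambda = 12, 8                       # more filters than spectral bins
T = rng.uniform(0.1, 0.9, size=(n_F, n_lambda))
phi_true = rng.uniform(0.0, 1.0, size=n_lambda)

I = T @ phi_true                            # discretized equation (1), no error
phi_hat = reconstruct_spectrum(T, I)
```

In this error-free, overdetermined example the estimate `phi_hat` agrees with `phi_true` to numerical precision; with measurement errors the same call returns the approximate solution that minimizes the sum of squared differences.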

The operation of a quantum dot spectrometer can be made more 
efficient by measuring the set of intensities simultaneously with a 
collection of light detectors, each of which is coupled to a designated 
CQD filter (Fig. 2b). If the collection of light detectors is arranged into
a miniaturized, two-dimensional detector array such as a charge- 
coupled device (CCD) array detector and the set of CQD filters is 
integrated into a CQD filter array and coupled to the detector array, 
then a quantum dot spectrometer can perform spectral measurements 
in a snapshot fashion, without scanning or switching filters. 
Figure 1 | Comparison of spectrometer mechanisms. Using a grating-based
spectrometer (top path), different bands of a light spectrum are first separated. 
Then the intensity of each band is measured individually. Using a narrowband- 
filter-based spectrometer (middle path), only one band of a light spectrum 
passes through one filter at a time, which results in the detection of the intensity 
of one band. Different bands of a spectrum are detected with different filters. 
These filters and detectors may be either spatially or temporally separated from 
each other; a set of spatially separated, discrete interference filters is shown here. 
Using a broadband-filter-based spectrometer (bottom path), multiple spectral 
bands pass through each broadband filter simultaneously, reaching a single 
detector and resulting in the detection of the intensity of multiple bands. This 
process is repeated with multiple broadband filters, each of which modifies the 
spectrum in a different way. The original spectrum is computationally 
reconstructed from the intensities detected from each broadband filter. This 
process is known as wavelength multiplexing. The quantum dot spectrometer 
presented here is a broadband-filter-based spectrometer that uses CQDs as 
broadband filter material. 


We demonstrate a quantum dot microspectrometer by fabricating 
one that operates over a spectral range of 300 nm (390-690 nm), and 
uses 195 different CQD filters (Fig. 3a, Methods subsection ‘Additional
experimental information’). Each filter is unique, as a result of tuning
the quantum dot size or composition (cadmium sulfide, CdS, or 
cadmium selenide, CdSe), with the effective bandgaps spread roughly 
evenly over the selected spectral range (Fig. 3b, Extended Data Fig. 1). 
Figure 3c shows a photograph of a quantum dot spectrometer formed 
by coupling a CQD filter array to a CCD, which was integrated into a 
digital camera (Methods subsection ‘Additional experimental 
information’, Extended Data Fig. 2). 

To illustrate the performance of the quantum dot spectrometer,
Fig. 4a-d presents measurements of four arbitrary spectra (but with
varying intensity levels and widths) that were generated using a
white-light source and a set of optical filters. The quantum dot
spectrometer was able to reproduce all the major features of the spectra.
Deviations between the reconstructed spectra and the reference spectra
measured using a commercial spectrometer (Methods subsection
‘Additional experimental information’) arise largely from subtle
features, which are due to system measurement errors and the number
of CQD filters used. The quantum dot spectrometer was also used to
measure the emission spectra of five fluorescent CQD samples (Fig. 4e)
with emission peaks ranging from 450 nm to 650 nm. As shown in
Fig. 4f, the emission spectra measured by the quantum dot spectrometer
match those measured with a standard spectrofluorometer (Methods
subsection ‘Additional experimental information’). To demonstrate the
spectral resolution of the quantum dot spectrometer, we used
monochromatic light with a bandwidth of 2 nm, centred at different
wavelengths, and obtained accurate spectra (Fig. 4g); even wavelength shifts
in the peak positions as small as 1 nm are accurately measured (Fig. 4h).
Further analysis shows that the quantum dot spectrometer resolves two
peaks separated by 2-3 nm (Methods subsection ‘Characterization of
spectral resolving capability’, Extended Data Fig. 3).

Spectral resolution and spectral reconstruction accuracy can be
improved by reducing measurement errors, for example, by using a
more stable calibration light source and a higher-performance CCD
detector (Methods subsection ‘Effects of measurement errors on spectral
resolving capability’, Extended Data Figs 4-6). Spectral resolution
and spectral range can be improved by using a larger number of CQDs,
which cover a broader spectral range (Methods subsection ‘Effects of
the number of filters on spectral resolving capability’, Extended Data
Fig. 7). Another strategy for improving spectral resolution and
reconstruction accuracy is to use prior knowledge of the spectra (such as the
sparseness of the signals) and more sophisticated reconstruction
algorithms such as L1-norm minimization, principal component
regression, and genetic algorithms (Methods subsection ‘Effects
of algorithms on spectral reconstruction accuracy’, Extended Data
Fig. 8). Our simulations have also shown that the dynamic range can
be increased by reducing measurement errors (Methods subsection
‘Effects of measurement errors on dynamic range’, Extended Data
Table 1). Finally, recent work on air-stable quantum dot solar cells
as well as the present experiments indicate that these quantum dot
devices are sufficiently stable for long-term practical use (Methods
subsection ‘Stability analysis’, Extended Data Fig. 9).

Quantum dot spectrometers are promising high-performance
microspectrometers because spectral resolution and spectral range can be
increased simultaneously, simply by increasing the number of different
CQDs used in the filter array. The wavelength multiplexing approach
means that these improvements are achieved without sacrificing total
photon efficiency. Another advantage of the quantum dot spectrometer
is its ability to analyse light from a source with a wide angular
distribution while maintaining the spectral resolution, as demonstrated by the
use of an integrating sphere between the light source and the
spectrometer (Methods subsection ‘Additional experimental information’); this
means it has a larger etendue than interferometric-optics-based
spectrometers. Lastly, simplifying the system by replacing dispersive optics with
CQD filters allows for a notable size reduction of the device. Given the
ability to make CCD pixels less than 10 µm in size and printed CQD
features, we expect that it is possible to produce quantum dot
spectrometers that are orders of magnitude smaller than our demonstration
system, without compromising spectral range or resolution.

Previous applications of CQDs largely relied on generating materials
(CQDs) with a high fluorescence quantum yield (ratio of the number
of photons emitted to that absorbed), but such materials are
difficult to synthesize and maintain. However, high fluorescence
quantum yield is not needed in a quantum dot spectrometer, which
is based on the most reliable property of CQDs: their light absorption
characteristics. This makes it feasible to replace the CQD materials
used in the quantum dot spectrometer demonstrated here with more
environmentally friendly, but relatively lower quantum yield,
cadmium-free materials such as zinc selenide (ZnSe) and indium
phosphide (InP), which adds to the appeal of CQDs as a step towards
producing single-chip high-performance spectrometers that are smaller
than a mobile phone camera image sensor. Such spectrometers may
be of benefit to space exploration, in surgical and clinical settings, to
personalized health care and to lab-on-a-chip diagnostics, for example,
wherever minimizing the size, weight, cost or complexity of
spectrometers is critical.

Figure 2 | Operation of quantum dot spectrometers. a, A basic quantum dot
spectrometer is composed of a set of CQD absorptive filters and a light detector.
A spectrum is measured by placing the CQD filters in front of the detector
one at a time and measuring the light intensities passing through each.
b, Instead of measuring the light intensities using one detector and one filter at
a time, a more efficient quantum dot spectrometer (such as the one
demonstrated here and illustrated in Fig. 3c) measures the set of intensities
in parallel by using an array detector, with each detecting element dedicated to
one CQD filter, all of which are integrated into a CQD filter array.

Figure 3 | CQD filters and an integrated quantum dot spectrometer. a, 195
CQD materials in the form of filters. Each dot is a CQD filter made of one type
of CQD material embedded in a polyvinyl butyral thin film. b, Transmission
spectra for some of the CQD filters shown in a. Transmission spectra of all 195
CQD filters can be found in Extended Data Fig. 1. c, A quantum dot
microspectrometer in the form of a digital camera with electronics and circuits,
made from the 195 CQD filters and a CCD array detector, which is comparable
in size to a US quarter.

Figure 4 | Quantum dot spectrometer measurements. a-d, Crosses
correspond to the quantum dot spectrometer measurement of broadband
spectra; solid lines correspond to reference spectra (measured using a
commercial spectrometer). e, Fluorescent emission of five CQD samples under
ultraviolet excitation. f, Markers correspond to measurements (using the
quantum dot spectrometer) of the emission spectra of the five CQD samples
shown in e; solid lines correspond to reference spectra (measured using a
spectrofluorometer). g, h, Measurements of monochromatic light spectra. The
peak positions of the six monochromatic lights in g are 400 nm, 450 nm,
500 nm, 550 nm, 600 nm and 650 nm. The peak positions of the six
monochromatic lights in h are 500 nm, 501 nm, 502 nm, 503 nm, 504 nm and
505 nm. The inset of h compares the measured spectrum of 500-nm
monochromatic light (crosses) and the reference spectrum (solid line).


Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique
to these sections appear only in the online paper.
METHODS


Additional experimental information. The 195 different CQD materials were 
taken from approximately 24 CdS and CdSe preparations using procedures from 
the literature and by taking aliquots during CQD growth. These CQD materials
were purified according to the procedures reported with their syntheses. The
fluorescent emission of the CQDs was quenched by the addition of a small amount
of p-phenylenediamine to prevent artefacts produced by emitted light from the
CQDs. To convert them into CQD filters, the CQDs were concentrated and mixed
with a polyvinyl butyral (PVB) chloroform solution. By placing a drop of the
CQD/PVB solution onto a substrate and allowing evaporation of the chloroform, a 
CQD filter consisting of CQDs dispersed in PVB polymer (as shown in Fig. 3a) is 
produced. Transmission spectra for all 195 CQD filters (Fig. 3a, Extended Data 
Fig. 1) were measured by using a broadband light source (Deuterium Tungsten 
Halogen DH-2000 light source from Ocean Optics) as a reference and a commer- 
cial spectrometer (QE65000 spectrometer from Ocean Optics, approximately 
0.8 nm spectral bins). 

CQD filter arrays were fabricated by tightly printing small drops of CQD/PVB 
solutions close together onto a glass cover slip with an automatic pipette. To make 
the quantum dot spectrometer shown in Fig. 3c, CQD/PVB solutions were printed 
into an array whose dimensions were designed to match the size of a CCD; the size 
of each filter is about 0.5 mm across. The CCD is a SONY ICX274AL sensor with a 
chip size of 8.5 mm × 6.8 mm (horizontal times vertical), which was configured to
provide 8-bit image data by Sensor Technologies America. The quantum dot 
spectrometer shown in Fig. 3c was formed by coupling the CQD filter array onto 
the CCD, with each CQD filter covering multiple CCD pixels. The transmission 
spectra of these CQD filters were characterized by comparing monochromatic 
light intensities recorded by the CCD with and without the CQD filters. The 
monochromatic light source was created by passing the light from an Oriel 
Research Xenon Lamp from Newport Corporation through a Triax 320 mono- 
chromator from Horiba. The output was then directed into an integrating sphere. 
The CCD sensor array was placed at the exit of the integrating sphere. The 
monochromatic light had a spectral bandwidth of 2 nm and was tuned to cover 
a spectral range from 390 nm to 690 nm in 1 nm steps. The transmission spectra of 
all individual CQD filters were measured together, one wavelength at a time. 
Specifically, before coupling the CQD filter array to the CCD, a reference was 
obtained by scanning the monochromatic light (as described above) and mea- 
suring the intensities at each pixel of the CCD. The same procedures were repeated 
after coupling the filter array to the CCD, resulting in a pair of intensities for each 
pixel for each wavelength. The ratio of each pair of intensities represents the 
transmission of one CQD filter at that wavelength. The transmission spectra of 
the set of CQD filters can be obtained from the set of ratios. Example transmission 
spectra of CQD filters in the array are shown in Extended Data Fig. 2. 
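
The ratio-based calibration described above can be sketched in a few lines of Python. This is only an illustration: the function name `filter_transmission`, the pixel-to-filter map and the choice to average the ratios over all pixels under one filter are assumptions made for the example, and the tiny arrays are made-up numbers rather than CCD data.

```python
import numpy as np


def filter_transmission(ref, filt, pixel_to_filter, n_filters):
    """Transmission spectra from paired monochromatic scans.

    ref, filt       : (n_wavelengths, n_pixels) intensities recorded without
                      and with the CQD filter array in place
    pixel_to_filter : (n_pixels,) integer index of the filter covering each pixel
    Returns an (n_filters, n_wavelengths) array of transmission values.
    """
    ratio = filt / ref                      # per-pixel transmission at each wavelength
    T = np.empty((n_filters, ref.shape[0]))
    for k in range(n_filters):
        # Assumption: average the ratios of all pixels covered by filter k.
        T[k] = ratio[:, pixel_to_filter == k].mean(axis=1)
    return T


# Made-up example: 2 wavelengths, 4 pixels, 2 filters covering 2 pixels each.
ref = np.full((2, 4), 200.0)
filt = np.array([[100.0, 120.0, 40.0, 60.0],
                 [150.0, 170.0, 20.0, 20.0]])
pixel_to_filter = np.array([0, 0, 1, 1])
T = filter_transmission(ref, filt, pixel_to_filter, n_filters=2)
```

Each row of `T` is then the transmission spectrum of one filter, in the form needed for equation (1).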

Reference spectra in Fig. 4a-d, g, h were measured by an HR2000 spectrometer
from Ocean Optics. Reference spectra of CQD fluorescent emission in Fig. 4f were
measured by a Horiba FluoroMax-3 spectrofluorometer.

The standard measurement error $\sigma$ is 0.022, which is the standard
deviation of the 195 differences obtained by subtracting the measured $I_i$ values
from the $I_i$ values that were computationally derived from equation (1) on the
basis of reference spectra.
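
A minimal sketch of this error estimate is given below. The function name `measurement_error` and the three-filter toy data are hypothetical, and the use of the population standard deviation (the `np.std` default) is an assumption, since the normalization is not specified here.

```python
import numpy as np


def measurement_error(T, phi_ref, I_measured):
    """Standard deviation of (computed - measured) filter intensities.

    T          : (n_F, n_lambda) filter transmission matrix
    phi_ref    : (n_lambda,) reference spectrum
    I_measured : (n_F,) intensities recorded through each filter
    """
    I_computed = T @ phi_ref                # discretized equation (1)
    return float(np.std(I_computed - I_measured))


# Toy data: three ideal single-bin filters, so I_computed equals phi_ref.
T = np.eye(3)
phi_ref = np.array([1.0, 2.0, 3.0])
I_measured = np.array([0.9, 2.1, 3.0])
sigma = measurement_error(T, phi_ref, I_measured)
```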

p-Phenylenediamine is sensitive to certain oxidation agents, which could result
in filter transmission changes. Encapsulation by polymers such as polyvinyl
butyral, polymethyl methacrylate, polystyrene and polylactic acid may offer a
layer of protection. Further improvements may be made by replacing
p-phenylenediamine with agents that are more resistant to oxidation and using CQDs with
intrinsically low luminescent quantum yield such as those with a large number of
trap states.

Characterization of spectral resolving capability. To characterize the spectral
resolving capability of the quantum dot spectrometer, we tested whether it could
resolve closely positioned doublet peaks with various
peak separations (from 2 nm to 5 nm). These doublet peaks were also created using
the Xenon Arc Lamp and the Triax 320 monochromator by producing a double
exposure, with the monochromator tuned between two wavelengths and a 2 nm
bandwidth for each monochromatic light generated by this system. The doublet
peaks were characterized by the HR2000 Ocean Optics spectrometer as a reference
(Extended Data Fig. 3, left panels). In this case, the quantum dot spectrometer was
able to resolve two peaks at a separation of 2-3 nm (Extended Data Fig. 3, right
panels).

Effects of measurement errors on spectral resolving capability. In the ideal case, 
when there is no measurement error, a quantum dot spectrometer resolves an 
unknown spectrum by solving a set of linear equations with a unique solution. The 
spectral range of the spectrometer depends on the wavelength coverage of the set of 
CQD filters used. The spectral resolution is determined by the number of different 
CQD filters and limited by the resolution of the calibrating system. 
In practice, measurement errors and instrument noise are inevitable, and only
approximate solutions are possible. In such a case, the spectral range is in principle 
not affected, but the spectral resolution decreases as the level of measurement error 
increases. The following simulation demonstrates this relationship. 

For this simulation, a spectral range of 390-620 nm was selected. The
transmission spectra of the CQD filters $T_i(\lambda)$ are the same spectra as shown in Extended
Data Fig. 1, and were binned into approximately 1.6 nm intervals (147 data points
over the 230 nm range). Given these bins and the requirement of at least three
points to represent a single peak, the smallest spectral distance between two
distinguishable simulated peaks is 3.2 nm.
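
As a rough check of these numbers, and assuming that two peak maxima can only be told apart when at least one lower-intensity bin lies between them (one reading of the three-point requirement), the bin width and the smallest resolvable separation follow as:

$$\Delta\lambda_{\mathrm{bin}} \approx \frac{(620 - 390)\ \mathrm{nm}}{147} \approx 1.6\ \mathrm{nm}, \qquad \Delta\lambda_{\min} = 2\,\Delta\lambda_{\mathrm{bin}} \approx 3.2\ \mathrm{nm}$$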

The simulations were performed according to the following procedure. An
arbitrary light spectrum $\Phi(\lambda)$ was generated. The spectrum consists of three
resolution-limited peaks (Extended Data Fig. 4a), which are spectrally
distinguishable from one another. Two of these peaks are separated by 3.2 nm, the smallest
possible spectral distance before the two peaks merge into one broader peak. The
intensities $I_i$ of this light spectrum after passing through each CQD filter were
simulated according to the following equation:

$$I_i = \left(1 + \varepsilon_{\mathrm{rand},i}\right) \sum_\lambda \Phi(\lambda)\,T_i(\lambda) \qquad (2)$$

where $\varepsilon_{\mathrm{rand},i}$ are random numbers, one for each filter, sampled from a normal
distribution centred at zero, with standard deviation $\sigma = 0, 0.0001, 0.001, 0.01, 0.1$
representing different (measurement) error levels. In the following simulations,
147 of the 195 CQD spectra are used to generate 147 $I_i$. These $I_i$ are substituted
into equation (1) to reconstruct the simulated light spectrum using least-squares
linear regression. For each error level, 100 individual simulations were performed
and the reconstructed spectra were averaged.
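
The procedure above can be sketched as follows. This is a hedged illustration only: `simulate_reconstruction`, the 12-filter by 8-bin random transmission matrix and the three-peak test spectrum are stand-ins chosen for the example, not the 147 measured CQD spectra used in the simulations.

```python
import numpy as np


def simulate_reconstruction(T, phi, sigma, n_runs=100, seed=0):
    """Average of n_runs least-squares reconstructions under multiplicative noise."""
    rng = np.random.default_rng(seed)
    total = np.zeros(T.shape[1])
    for _ in range(n_runs):
        eps = rng.normal(0.0, sigma, size=T.shape[0])    # one random number per filter
        I = (1.0 + eps) * (T @ phi)                      # equation (2)
        phi_hat, *_ = np.linalg.lstsq(T, I, rcond=None)  # solve equation (1)
        total += phi_hat
    return total / n_runs


# Stand-in filters and a spectrum with three single-bin peaks (hypothetical data).
rng = np.random.default_rng(1)
T = rng.uniform(0.1, 0.9, size=(12, 8))
phi = np.zeros(8)
phi[[1, 3, 6]] = 1.0

error_levels = (0.0, 0.0001, 0.001, 0.01, 0.1)
averaged = {s: simulate_reconstruction(T, phi, s) for s in error_levels}
```

With `sigma = 0` the averaged reconstruction reproduces `phi` to numerical precision; the larger error levels can be inspected in `averaged` to see how the reconstruction degrades for a given filter set.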

At the error level of $\sigma = 0$, the simulation yields a reconstructed spectrum that
matches the original light spectrum perfectly (Extended Data Fig. 4b) and the two
peaks separated by 3.2 nm are resolved. At the error level of $\sigma = 0.0001$, the
simulation shows that the difference between the reconstructed spectrum and
the original is extremely small (Extended Data Fig. 4c) and the two peaks separated
by 3.2 nm are still resolved. As the error level rises to $\sigma = 0.001$, the differences
between the reconstructed and the original spectrum increase (Extended Data Fig.
4d). The positions of the three peaks are nevertheless accurately represented.
When the error level reaches $\sigma = 0.01$, the simulation no longer resolves the
two peaks that are separated by 3.2 nm, but rather treats them as one broader
peak (Extended Data Fig. 4e). As the error level increases to $\sigma = 0.1$, the error
becomes large enough to lose spectral accuracy (Extended Data Fig. 4f).
Nevertheless, major peak information can still be obtained from the simulation.
As the peaks become broader in the reconstructed spectrum, the intensities drop,
keeping the total intensity (summed over the spectral range) consistent within the
tolerance of the level of the error.

To demonstrate the distribution of the 100 simulations that were averaged to
produce the reconstructed spectra in Extended Data Fig. 4 (red dashed lines), we
show the distribution of the 100 simulations for Extended Data Fig. 4d in
Extended Data Fig. 5b. Specifically, we define $P = (P_1 + P_2)/2$, where $P_1$
and $P_2$ are intensity values in each of the 100 simulations averaged to produce
the red spectrum, at the position of the two left-most peaks in the green
spectrum; we define $V$ as the corresponding intensity value between the two peak
positions and $R = V/P$. If $R < 1$, then the simulation produces two
distinguishable peaks at $P_1$ and $P_2$, whereas if $R \geq 1$, then $P_1$ and $P_2$ become
indistinguishable in the simulation. We calculated $R$ for each of the 100 simulations, and
plotted it in a histogram (Extended Data Fig. 5b). The frequency decays rapidly
with increasing $R$, as seen from the histogram. Of the 100 simulations, there
were 60 with $R < 1$ and 40 with $R \geq 1$.
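
The ratio $R$ defined above is straightforward to compute for a single reconstructed spectrum, as the following sketch shows. The function name `valley_to_peak_ratio` and the toy spectra are hypothetical, and taking $V$ at the bin midway between the two peak positions is an assumption suited to peaks that are two bins apart.

```python
import numpy as np


def valley_to_peak_ratio(spectrum, i1, i2):
    """R = V / P for two peak positions i1 and i2 (bin indices)."""
    p = 0.5 * (spectrum[i1] + spectrum[i2])   # P = (P1 + P2) / 2
    v = spectrum[(i1 + i2) // 2]              # assumed valley: the midpoint bin
    return float(v / p)


# Toy reconstructed spectra (made-up numbers).
resolved = np.array([0.0, 1.0, 0.4, 0.8, 0.0])
merged = np.array([0.0, 1.0, 1.0, 1.0, 0.0])

r_resolved = valley_to_peak_ratio(resolved, 1, 3)   # valley below the peak mean
r_merged = valley_to_peak_ratio(merged, 1, 3)       # no dip between the peaks
```

Applying the function to each of the 100 simulated reconstructions and counting the cases with $R < 1$ gives the kind of tally reported above.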

To explore the resolving capability of the system at error levels of
$\sigma = 0.001, 0.01, 0.1$, we generated more spectra (Extended Data Fig. 6a, c, e) that
are similar to the spectrum shown in Extended Data Fig. 4a, but with slightly
broader peaks and larger peak separations (about 4.8 nm, 5.6 nm and 13 nm,
respectively) between the two left-most peaks. When the error level is at
$\sigma = 0.001$, the simulation produces very accurate reconstructed spectra
(Extended Data Fig. 6b) and the two peaks separated by about 4.8 nm were easily
resolved. At the error level of $\sigma = 0.01$, the simulation produces a reconstructed
spectrum in which the two peaks separated by about 5.6 nm are just
resolved (Extended Data Fig. 6d). At the error level of $\sigma = 0.1$, the simulation
suggests that the spectral resolving capability is notably reduced and only two
peaks separated by more than about 13 nm are resolved (Extended Data Fig. 6f).

Effects of the number of filters on spectral resolving capability. A remedy to the
loss of spectral information and the reduced spectral resolving capability under
conditions with relatively high levels of measurement error and instrument noise
is to increase the number of CQD filters and thus the number of intensities $I_i$. This
is because when redundant spectral information is provided by the extra filters, the
error level is statistically reduced. In addition, the algorithm may be modified to
reject data points with errors larger than some defined level.
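
The statistical benefit of redundant filters can be illustrated with a small synthetic experiment. Everything in this sketch is an assumption made for illustration: the function name `mean_reconstruction_error`, the uniformly random transmission matrices, and the sizes (20 spectral bins with 20 or 40 filters) are not the measured CQD data.

```python
import numpy as np


def mean_reconstruction_error(n_filters, n_bins=20, sigma=0.01,
                              n_trials=50, seed=0):
    """Mean RMS reconstruction error over random synthetic filter sets."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        T = rng.uniform(0.0, 1.0, size=(n_filters, n_bins))
        phi = rng.uniform(0.0, 1.0, size=n_bins)
        eps = rng.normal(0.0, sigma, size=n_filters)
        I = (1.0 + eps) * (T @ phi)                      # noisy intensities
        phi_hat, *_ = np.linalg.lstsq(T, I, rcond=None)  # least-squares estimate
        errors.append(np.sqrt(np.mean((phi_hat - phi) ** 2)))
    return float(np.mean(errors))


err_square = mean_reconstruction_error(n_filters=20)     # as many filters as bins
err_redundant = mean_reconstruction_error(n_filters=40)  # twice as many filters
```

In this synthetic setting the redundant filter set gives the smaller mean error, in line with the argument above; the size of the improvement for real CQD filters depends on their actual transmission spectra.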

In the following simulations, the number of CQD filters is increased from 147 to 
195. The same procedures and original (incident) spectra are used as in the above 
simulations. The spectra reconstructed with 195 filters demonstrated improve- 
ments from those using 147 filters (Extended Data Fig. 7). However, the limited 
number of CQD filters produced in the experiments limits our ability to run 
further simulations with increased filter numbers, which we expect would dem- 
onstrate further improvements in resolution. 

Effects of algorithms on spectral reconstruction accuracy. The example shown 
below demonstrates the difference that a more sophisticated algorithm can make 
to the reconstruction of a spectrum. The red crosses in Extended Data Fig. 8a 
correspond to the reconstructed spectrum using the measurements from the 
quantum dot spectrometer; the blue line corresponds to the reference spectrum 
taken by the Ocean Optics spectrometer. In this case, the spectral reconstruction 
was performed using a least-squares linear regression. However, with a more 
sophisticated algorithm, based on generalized least squares and regularization, 
the same data are reconstructed to give the spectrum that is plotted with the red 
markers in Extended Data Fig. 8b (the blue line is the same reference spectrum). 
This shows that such algorithms reduce ghost (spurious) lines and therefore 
increase the reconstruction accuracy. 
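
As a rough illustration of why regularization suppresses spurious lines, the sketch below compares ordinary least squares with Tikhonov (ridge) regularization. This is a simpler stand-in, not the generalized-least-squares algorithm referred to above, and the overlapping Gaussian filters, the two-line spectrum, the 1% error level and the value of `lam` are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
grid = np.arange(n)
# Synthetic, strongly overlapping Gaussian filters (not measured CQD spectra).
T = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / 6.0) ** 2)

phi = np.zeros(n)
phi[12], phi[25] = 1.0, 0.6                      # hypothetical two-line spectrum
I_noisy = (T @ phi) * (1.0 + 0.01 * rng.standard_normal(n))

# Ordinary least squares: noise is amplified by the small singular values of T.
phi_ols, *_ = np.linalg.lstsq(T, I_noisy, rcond=None)

# Tikhonov (ridge) solution of min ||T x - I||^2 + lam * ||x||^2.
lam = 1e-3
phi_ridge = np.linalg.solve(T.T @ T + lam * np.eye(n), T.T @ I_noisy)

def rms(estimate):
    """RMS deviation of an estimate from the true synthetic spectrum."""
    return float(np.sqrt(np.mean((estimate - phi) ** 2)))

print(rms(phi_ols), rms(phi_ridge))
```

The penalty term trades a small bias for a large reduction in noise amplification, which is the mechanism by which ghost lines are reduced.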

Effects of measurement errors on dynamic range. The dynamic range of a 
spectrometer is defined as the ratio between the maximum intensity it can measure 
and the minimum signal that is distinguishable from the noise. In a grating-based 
spectrometer, the maximum intensity is typically equal to the maximum range of 
the analogue-to-digital converter (ADC) of the detector. The minimum signal that 
is distinguishable from the noise is represented by the standard deviation of the 
dark signal (that is, the dark counts of the detector). 

In a quantum dot spectrometer, however, the dynamic range should be calcu- 
lated on the basis of the maximum intensity and the dark counts from the recon- 
structed final spectrum, instead of those read directly from the detector. Therefore, 
the dynamic range is not only determined by the range of the ADC and the dark 
signal, but also by the reconstruction algorithm and the measurement errors. In 
the following simulation, we show how the level of the measurement errors affects 
the dynamic range of a quantum dot spectrometer. 

We consider a case in which a photodetector with a 16-bit ADC (which can 
produce an integer from 0 to 2^16 − 1 = 65,535) is used. Similar to the above 
simulations, a spectral range of 390-620 nm is selected, and the transmission spectra T_i(λ) 
of the 195 CQD filters are used and binned to have approximately 1.6 nm intervals 
(147 data points over the 230 nm range). For the maximum intensity, we consider 
an arbitrary incident-light spectrum Φ(λ) of a solitary point peak with an intensity 
of 65,535. The intensities I_i of Φ(λ) after passing through each CQD filter were 
computed according to equation (2). These I_i are substituted into equation (1) to 
reconstruct the simulated light spectrum using least-squares linear regression. I_max 
is the maximum intensity in the reconstructed spectrum. This procedure is 
repeated 100 times at each error level. The 100 I_max values are averaged to give I_max,mean. 

To calculate the standard deviation of the dark signal, we generate a series of 
dark counts I_d,i = |65,535(0 + ε_rand,d,i)|, where ε_rand,d,i are random numbers, one 
for each filter, sampled from a normal distribution with a standard 
deviation of σ and a mean of 0, with σ = 0.0001, 0.001, 0.01, 0.1 representing 
different error levels. The absolute value is taken to avoid any negative detector 
counts in this simulation. After generating the dark counts, the spectrum is 
reconstructed on the basis of these I_d,i. σ_o,d is the standard deviation of the reconstructed 
spectral intensities across the entire wavelength range. This procedure is repeated 
100 times at each error level, and the 100 σ_o,d values are averaged to give σ_o,d,mean. 

The dynamic range at each error level is DR = I_max,mean/σ_o,d,mean. Given in 
Extended Data Table 1 are the simulated I_max,mean, σ_o,d,mean and DR for a quantum 
dot spectrometer for different levels of measurement errors. This shows that the 
dynamic range of a quantum dot spectrometer increases as the level of 
measurement errors decreases. 
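
The procedure above can be sketched in a few lines. The filter and bin counts (195 and 147) and the dark-count formula come from the text; everything else is an assumption made for illustration: the transmission matrix is random rather than measured, the bright-case error is taken to be multiplicative, and the peak is placed in the middle bin. The numbers therefore do not reproduce Extended Data Table 1, only the trend.

```python
import numpy as np

FULL_SCALE = 2**16 - 1          # 16-bit ADC, 65,535 counts
N_FILTERS, N_BINS = 195, 147    # counts taken from the text

def dynamic_range(sigma, trials=100, seed=0):
    rng = np.random.default_rng(seed)
    T = rng.uniform(0.0, 1.0, (N_FILTERS, N_BINS))   # synthetic stand-in for T_i(lambda)
    T_pinv = np.linalg.pinv(T)                       # least-squares reconstruction operator
    peak = np.zeros(N_BINS)
    peak[N_BINS // 2] = FULL_SCALE                   # solitary point peak
    i_max, sigma_od = [], []
    for _ in range(trials):
        # Bright case: filtered intensities with an assumed multiplicative error.
        I = (T @ peak) * (1.0 + sigma * rng.standard_normal(N_FILTERS))
        i_max.append(np.max(T_pinv @ I))
        # Dark case: I_d,i = |65,535 (0 + eps)| for each filter.
        I_dark = np.abs(FULL_SCALE * sigma * rng.standard_normal(N_FILTERS))
        sigma_od.append(np.std(T_pinv @ I_dark))
    return float(np.mean(i_max) / np.mean(sigma_od))

dr = {s: dynamic_range(s) for s in (0.0001, 0.001, 0.01, 0.1)}
print(dr)
```

Because the dark-count spread scales with the error level while the reconstructed peak stays near full scale, the ratio falls as the error level rises.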
When the system is Poisson-noise limited, the dynamic range may be different to 
the above simulations because Poisson noise is sampled from a Poisson distri- 
bution instead of a normal distribution. However, the normal distribution is a 
good approximation of the Poisson distribution when the mean (photon number 
in this case) is large enough (> 1,000, for instance). Typically, when looking at the 
dynamic range, the signal level (number of photons) is relatively high. Therefore, 
we expect that the dynamic range of a Poisson-noise dominated quantum dot 
spectrometer follows the same trend as for the above simulations. 

For example, consider a system with gain of one electron per analogue-to-digital 
unit (ADU) and a quantum efficiency of 1/3. In this system, a 65,535 ADU would 
imply 196,605 photons, which corresponds to a standard error of about 443. 
Assuming the errors are Poisson distributed, we calculated I_max,mean ≈ 64,500 
using the same methods as described above. This result is indistinguishable from 
the result for a normal distribution at an error level of σ = 0.0022 ≈ 443/196,605. 
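
The arithmetic in this example can be checked directly. The variable names below are illustrative; the gain, quantum efficiency and full-scale value are those given in the text.

```python
import math

full_scale_adu = 2**16 - 1        # 65,535 ADU from a 16-bit ADC
electrons_per_adu = 1             # gain stated in the text
photons_per_electron = 3          # inverse of the quantum efficiency of 1/3

photons = full_scale_adu * electrons_per_adu * photons_per_electron
shot_noise = math.sqrt(photons)   # Poisson standard deviation = sqrt(mean)
relative_error = shot_noise / photons

print(photons, shot_noise, relative_error)
```

The relative error comes out at roughly 0.00225, which corresponds to the error level of about 0.0022 quoted above.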
Stability analysis. The stability of the quantum dot spectrometer was tested under 
typical ambient laboratory conditions. Plotted in Extended Data Fig. 9 are two 
spectra measurements taken about six months apart, without recalibration. The 
peaks shown in both plots are at 400 nm, 450 nm, 500 nm, 501 nm, 503 nm, 
505 nm, 550 nm and 600 nm; both sets of results reproduce the peaks accurately. 

In addition, it is known that absorption is one of the most stable properties of 
quantum dots. Commercial products based on emission of quantum dots” and the 
recent work on air-stable quantum dot solar cells” showed that high stability of the 
more sensitive properties such as fluorescence emission and charge transportation 
can nevertheless be achieved by carefully designing the chemistry of quantum dots 
and the conjugating materials. 
Extended Data Figure 2 | Transmission spectra for selected CQD filters in the CQD filter array. In each subplot, the horizontal axis measures wavelength (in 
nanometres) and the vertical axis measures the transmission fraction (×100%). 
Extended Data Figure 3 | Characterization of the spectral resolution of the 
quantum dot spectrometer. The left panels are doublet peaks with peak 
separations of 2-5 nm measured by a HR2000 spectrometer as a reference. The 
right panels are the same doublet peaks measured by the quantum dot 
spectrometer. 
Extended Data Figure 4 | A simulated spectrum and spectral 
reconstruction. a, The green crosses represent the data points of a simulated 
light spectrum in intervals of 1.6 nm. The separation between the two left-most 
peaks is 3.2 nm. b, Spectral reconstruction (σ = 0, n_F = 147). With σ = 0, the 
spectral reconstruction process is equivalent to solving a set of linear equations 
that has a unique solution. The reconstructed spectrum (the red dotted line 
with red circles representing data points) matches the original (incident) light 
spectrum perfectly and the two peaks separated by 3.2 nm are resolved. 
c-f, Spectral reconstruction (σ = 0.0001, 0.001, 0.01, 0.1, respectively, 
n_F = 147) using least-squares linear regression. c, The original light spectrum is 
reproduced accurately and the two peaks separated by 3.2 nm are resolved. 
d, The original light spectrum is reproduced reasonably well and the peak 
positions accurately represented. e, The spectral reconstruction does not 
reproduce the original light spectrum very accurately and some spectral 
information is lost. f, The spectral reconstruction no longer reproduces the 
original light spectrum accurately and a lot of spectral information is lost. 
Nevertheless, major peak information can still be obtained from the simulation. 
Extended Data Figure 5 | Distribution of simulations. a, Same as Extended Data Fig. 4d. b, The frequency (of the given value of R occurring in the 100 
simulations averaged to produce a) decays with increasing R. 
a, c, e, Second, third and fourth simulated light spectra. The green crosses 
represent the simulated data points, in intervals of 1.6 nm. The separation 
between the two leftmost peaks is about 4.8 nm, 5.6 nm and 13 nm, respectively. 
b, d, f, Spectral reconstruction (σ = 0.001, 0.01, 0.1, respectively, n_F = 147) 
using least-squares linear regression. b, The original light spectrum is [...] 
most peaks are resolved. d, Spectral reconstruction does not reproduce the 
original light spectrum very accurately and some spectral information is lost. 
However, the two left-most peaks are resolved. f, Spectral reconstruction no 
longer reproduces the original light spectrum very accurately and some spectral 
information is lost. However, the two left-most peaks are resolved. 
Extended Data Figure 7 | Comparison of spectral reconstructions with 
n_F = 147 and n_F = 195. a-c, The separation between the two left-most peaks is 
3.2 nm, 4.8 nm and 13 nm, respectively. The spectral reconstruction 
(σ = 0.001, 0.01, 0.1, respectively) using both filter sets reproduced the main 
features of the original light spectrum. An improvement in the reconstructed 
spectrum with 195 filters is observed, compared to that with 147 filters, 
where all other conditions are the same. However, due to the small relative 
difference between 147 and 195, the improvement is limited. 
Extended Data Figure 8 | Effects of algorithms on spectral reconstruction 
accuracy. a, Plotted with the red markers is a spectrum measured by the 
quantum dot spectrometer and reconstructed using least-squares linear 
regression. The blue line represents the reference spectrum. b, Plotted with the 
red markers is the spectrum based on the same quantum dot spectrometer 
measurement data, but reconstructed using a more sophisticated algorithm. 
The blue line represents the same reference spectrum. 
Extended Data Figure 9 | Stability analysis. In both Measurement 1 and Measurement 2 (taken six months apart, without recalibration), the peaks shown are at 
400 nm, 450 nm, 500 nm, 501 nm, 503 nm, 505 nm, 550 nm and 600 nm. 
Extended Data Table 1 | Simulation of the effects of errors on the dynamic range 

Error level σ    0.0001    0.001    0.01     0.1 
I_max,mean       65000     65000    62000    48000 
σ_o,d,mean       2.4       24       240      2400 
DR               27000     2700     260      20 
The North Atlantic Oscillation (NAO) is the major source of 
variability in winter atmospheric circulation in the Northern 
Hemisphere, with large impacts on temperature, precipitation 
and storm tracks’, and therefore also on strategic sectors such as 
insurance’, renewable energy production’, crop yields* and water 
management’. Recent developments in dynamical methods offer 
promise to improve seasonal NAO predictions’, but assessing 
potential predictability on multi-annual timescales requires docu- 
mentation of past low-frequency variability in the NAO. A recent 
bi-proxy NAO reconstruction’ spanning the past millennium sug- 
gested that long-lasting positive NAO conditions were established 
during medieval times, explaining the particularly warm conditions 
in Europe during this period; however, these conclusions are 
debated. Here, we present a yearly NAO reconstruction for the past 
millennium, based on an initial selection of 48 annually resolved 
proxy records distributed around the Atlantic Ocean and built 
through an ensemble of multivariate regressions. We validate the 
approach in six past-millennium climate simulations, and show that 
our reconstruction outperforms the bi-proxy index. The final recon- 
struction shows no persistent positive NAO during the medieval 
period, but suggests that positive phases were dominant during 
the thirteenth and fourteenth centuries. The reconstruction also 
reveals that a positive NAO emerges two years after strong volcanic 
eruptions, consistent with results obtained from models and satellite 
observations for the Mt Pinatubo eruption in the Philippines*’. 

In the North Atlantic sector, roughly half of the interannual vari- 
ability in winter atmospheric pressure is explained by fluctuations in 
the NAO. The NAO is characterized by a changing dipole of sea-level 
pressure between the Azores and Iceland, and has widespread impacts 
on temperature and precipitation across Eurasia, North Africa, 
Greenland and northern North America”. 

A recent study discarded the idea that there has been a strong direct 
radiative influence of solar forcing on Northern Hemisphere tempera- 
tures in the past millennium'’. Consequently, the hypothesis has emerged 
that changes in the NAO could explain part of the decadal-to-centennial 
variations in these temperatures. Also, NAO variability itself can be 
externally driven, and therefore allow for an indirect effect of external 
forcing on temperature not accounted for in the previous study". 

A millennial bi-proxy NAO reconstruction’ (hereafter NAO_Trouet) 
shows persistent positive phases during the Medieval Climate 
Anomaly (MCA, roughly AD 1000-1300; ref. 12), as does a multi- 
millennial NAO reconstruction from one lake sediment record in 
Greenland’’. Other proxy records from the Iberian Peninsula’ sup- 
port drier conditions during this period, a common fingerprint of 
positive NAO phases. However, some doubts have been raised regard- 
ing the persistence of a positive NAO phase during medieval times. 
For example, such persistent phases are not reproduced in any of 13 
different past-millennium climate simulations!’. Also, if the NAO 


were predominantly positive during this period, low temperatures 
should be recorded in southwest Greenland"®, and warm temperatures 
in northwestern Europe, but no indication of such anomalies appears 
in ice-core records and documentary sources'’. Proxy-based NAO 
reconstructions are challenging because of the nonstationarity of the 
statistical relationships and regional teleconnections on which they are 
based’*". Moreover, a ‘perfect-model’ study, using reanalyses and 
climate models as physically consistent surrogates of the real world 
(see Methods), has shown that the two proxy records involved in 
NAO_Trouet are insufficient to guarantee a robust reconstruction 
throughout the entire last millennium”®. 

All of these factors point towards the need for a new multi-proxy 
NAO reconstruction for the past millennium, based on a larger set of 
proxy records that is representative of several NAO fingerprints in 
temperature, precipitation and droughts. Unlike NAO_Trouet, which is 
based on 30-year smoothed data, our reconstruction is built at annual 
resolution, allowing us to test the effect of major volcanic eruptions 
and the 11-year solar cycle on the NAO. Although it is well established 
that a positive NAO-like sea-level pressure pattern is observed one to 
two years after major volcanic eruptions, most analyses focus on the 
Pinatubo eruption®” (the only major eruption to be observed by sat- 
ellite). The effect of solar activity is less well constrained, although 
recent studies suggest that the Arctic Oscillation (which is closely 
linked to the NAO) is modulated by the 11-year solar cycle”*. 

Our study is based on 48 annually resolved proxy time series (Fig. 1a 
and Extended Data Table 1), which are significantly correlated (P value 
<0.10) with the longest instrumental record of the NAO (hereafter 
NAO_Vinther) in the period in which they overlap. All of these records 
were screened to fulfil two additional conditions: first, they encompass 
the eleventh to the twentieth centuries continuously; and second, they 
have a documented relationship in previous literature with a climate 
variable easily extractable from climate models. 

The NAO is reconstructed using principal component regression 
(PCR), following the procedure of a previous reconstruction from AD 
1400 onwards” (see Methods). To identify the robust features in the 
reconstruction, we generate an ensemble that explores the sensitivity 
to calibration, by extracting 100 random samples of 117 years from the 
period that is common to all proxies and the NAO (that is, AD 1823- 
1969). The remaining 30 years will later be used for validation. 
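
The resampling scheme just described can be sketched as follows. The seed and the variable names are arbitrary choices for illustration, and the sketch covers only the split into calibration and validation years, not the regression itself.

```python
import random

years = list(range(1823, 1970))   # AD 1823-1969 inclusive: 147 years
rng = random.Random(0)            # fixed seed, an arbitrary choice for repeatability

splits = []
for _ in range(100):
    calibration = sorted(rng.sample(years, 117))           # 117 random calibration years
    held_out = set(calibration)
    validation = [y for y in years if y not in held_out]   # remaining 30 years
    splits.append((calibration, validation))

print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each ensemble member is then calibrated on its own 117 years and scored on the 30 years it never saw.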

Two alternative ensemble reconstructions are constructed 
by imposing restrictions on the initial proxy data set. The first 
ensemble (NAO_cc, for ‘calibration-constrained’) uses as many proxy 
records as possible to guarantee an optimal calibration, verifying first 
that the proxy records still correlate significantly with NAO_Vinther 
within the calibration period. Figure 1b illustrates the individual 
contribution (as cumulated weights; see Methods) of the proxy records to 
the NAO_cc reconstruction. Each record participates at least once in the 
ensemble. This approach optimizes the statistical match with the 
instrumental NAO_Vinther within the calibration period, but cannot 
ensure that NAO-proxy relationships have an actual physical basis, 
or that these relationships are stationary in time. Both issues are tackled 
simultaneously in a second ensemble reconstruction (NAO_mc, for 
‘model-constrained’). The proxy selection is first narrowed down with 
reanalyses (Extended Data Table 2) to guarantee realistic NAO tele- 
connections, and further constrained with eight past-millennium 
PMIP3 simulations (Paleoclimate Modelling Intercomparison Project 
Phase 3; Extended Data Table 3). This latter step ensures that the 
teleconnections are stable throughout the whole period. The NAO_mc 
ensemble reconstruction uses those proxies whose NAO-climate 
fingerprint is supported by at least one of the reanalyses and by half 
of the PMIP3 simulations. This final approach is highly restrictive and 
leads to a selection of only nine proxy records (Fig. 1c). If some tele- 
connections are unresolved by models, this method may discard proxy 
records that are actually sensitive to the NAO. 
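
The model-constrained screening rule can be written as a small predicate. This is a sketch only: the proxy names and support flags are invented, the number of reanalyses (two here) is a placeholder, and "half of the PMIP3 simulations" is read as "at least half", which is an assumption.

```python
def is_model_constrained(reanalysis_support, pmip3_support):
    """Each argument is a list of booleans, one per reanalysis or simulation."""
    ok_reanalysis = any(reanalysis_support)                       # at least one reanalysis
    ok_models = sum(pmip3_support) >= len(pmip3_support) / 2      # at least half of the runs
    return ok_reanalysis and ok_models

# Hypothetical proxies; the support flags are invented for illustration.
candidates = {
    "proxy_A": ([True, False], [True] * 5 + [False] * 3),
    "proxy_B": ([False, False], [True] * 8),
    "proxy_C": ([True, True], [True] * 3 + [False] * 5),
}
selected = [name for name, (rea, sims) in candidates.items()
            if is_model_constrained(rea, sims)]
print(selected)
```

Only the first hypothetical proxy passes both tests, which mirrors how restrictive the double criterion is.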


Figure 1 | Proxy selection. a, Location of the 
initial 48 proxy records (ice cores, lake sediments, 
speleothems or tree rings) preselected for the 
reconstructions, using symbols and colours to 
represent the different types of archive. Some 
symbols have been slightly displaced to improve 
their visibility. The two proxies used in NAO_Trouet 
are highlighted with yellow boxes. The true 
coordinates of each record are detailed in Extended 
Data Table 1. b, c, The colours of the symbols now 
represent the cumulated beta-weights (see 
Methods) associated with the proxies that 
contribute to the NAO_cc (b) and NAO_mc 
(c) ensemble reconstructions. 
The two ensemble reconstructions are displayed in Fig. 2a and b 
and compared to NAO_Trouet with 50-year moving correlations 
(Fig. 2c). Only NAO_mc shows some periods of relatively good 
agreement with NAO_Trouet outside the instrumental period, coinciding 
with the largest-amplitude changes. However, neither of the ensemble 
reconstructions exhibits persistent positive NAO phases in medieval 
times. Instead, a Student’s t-test on their means, applied in 100-year 
intervals, highlights a first period with non-zero mean negative 
values that is common to NAO_mc and NAO_Trouet (left-hand cyan 
horizontal bar in Fig. 2b). In AD 1150, the NAO_mc undergoes a rapid 
transition towards positive phases, which remain significantly 
different (P < 0.05) from zero for about 240 years (magenta horizontal bar). 
In the following four centuries, until the industrial period, the NAO_mc 
again depicts predominantly negative values. Similar features, albeit 
with differences in the timing of phase changes, are found in the 
NAO_cc (Fig. 2a). 
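
A 50-year moving correlation of the kind used for this comparison can be sketched as below. The two series are synthetic and invented for illustration, and the sketch does not include the significance assessment, which in the paper accounts for autocorrelation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def moving_correlation(x, y, window=50):
    """Correlation in every window of `window` consecutive years."""
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]

# Synthetic yearly series, invented for illustration only.
a = [math.sin(0.3 * t) for t in range(200)]
b = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(200)]
r = moving_correlation(a, b)
print(len(r), min(r), max(r))
```

A series of length n yields n - 49 window values, so a 200-year record gives 151 correlations.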


Figure 2 | Ensemble NAO reconstructions. 
a, Ensemble of 100 yearly NAO_cc reconstructions. 
The intermediate and light yellow envelopes 
represent, respectively, the total ensemble spread, 
and the regression uncertainties across the 
ensemble (as departures of ±2 standard error, s.e., 
from the individual reconstructions). Also shown 
are the ensemble mean (dark yellow line), 
NAO_Vinther (black) and NAO_Trouet (red). 
Horizontal magenta (or cyan) bars enclose 
centuries (marked by tick marks) with positive (or 
negative) mean NAO values that are significantly 
different (P < 0.05) from zero (see Methods). 
b, NAO_mc ensemble. c, 50-year running 
correlations between the mean ensemble 
reconstructions and NAO_Trouet. Dark-coloured 
lines represent correlations significant at the 95% 
confidence level (P < 0.05). 
We now assess the difference in performance between the ensembles. 
By construction, NAO_cc exhibits higher correlations with NAO_Vinther 
within the calibration period than does NAO_mc (Fig. 3a), as NAO_cc 
includes more proxy data, and therefore more principal components 
contribute to the multivariate regression (Extended Data Table 4). 
However, this does not translate into larger correlations for the 
validation years (Fig. 3b). Indeed, NAO_cc is statistically overfitted with 
spurious predictors, as shown by the higher validation scores for NAO_mc. In 
Figure 3c-i we verify that both ensemble reconstructions are consistent 
with other observational and reconstructed NAO time series (see also 
Extended Data Table 5). Once again, NAO_mc stands out, yielding 
generally higher correlation coefficients with the set of NAO records. We 
also emphasize the overall good performance of the ensemble mean, 
with correlations being almost systematically above the median of the 
ensemble. The contribution of each proxy is estimated as the total sum 
of its beta-weights. In both ensemble reconstructions, and most 
prominently in NAO_mc, the largest weights correspond to south and west 
Greenland ice cores (where the strongest impacts of the NAO have been 
reported’*"*) and to the speleothem from Scotland (Fig. 1b, c). 

To test the reliability of the NAO_mc reconstruction outside the 
observational period, we perform a perfect-model analysis of the 
PMIP3 simulations. We reproduce the same PCR strategy now applied 
to pseudo-proxies extracted from model outputs and perturbed with 
noise to mimic the actual proxies (Methods; ref. 25). To be more 
faithful to the original reconstruction, we use only the six PMIP3 
simulations that represent a stable relationship of the NAO with the 
nine proxies in Fig. 1c. Extended Data Fig. 1 represents the correlation 
of the ensemble NAO_mc pseudo-reconstruction with the simulated 
NAO in three different periods: the same two subsets used before for 
calibration/validation, and the long validation period AD 1000-1822. 
As in the actual reconstructions (Fig. 3), all models depict a notably 
large spread for the 30-year validation scores, estimated with fewer 
degrees of freedom and therefore being more sensitive to stochastic 
noise than are the calibration statistics. In the long validation period— 
measuring the true skill of the pseudo-reconstruction—correlations 
are always significant and remain close to the range of the calibration 
scores. Additionally, we compute two pseudo-reconstructions: 
NAO_Trouet (using Morocco and Scotland pseudo-proxies) and 
NAO_Lehner (an improved version of NAO_Trouet, including hypothetical 
proxies from the Iberian Peninsula and Scandinavia, tested in a 
previous perfect-model study”). The performance of these pseudo- 
reconstructions (in terms of correlation with the simulated NAO) is 
evaluated in a 50-year moving window throughout the entire 
millennial runs, and compared with the performance of the mean NAO_mc 
ensemble (Extended Data Fig. 2). Our method (represented by the 
mean NAO_mc ensemble) shows clearly higher correlations with the 
simulated NAO (r_median = 0.44) than does NAO_Trouet (r_median = 0.21), 
but lower values than NAO_Lehner (r_median = 0.55). There is therefore 
scope for improving our model, provided that new proxy records 
become available at key locations, associated with strong NAO finger- 
prints that are unrepresented in our data set. 

Finally, we address the impact of external forcings on the yearly 
NAO_mc ensemble mean. We do not detect any significant relationship 
with two different total solar irradiance reconstructions”°”’, even at the 
11-year timescale (data not shown). In contrast, a composite analysis 
of the NAO response to the strongest 11 volcanic eruptions during the 
past millennium shows a significant (P < 0.05) positive phase appear- 
ing two years after the selected events (Fig. 4a). This result is robust, 
regardless of the volcanic reconstruction considered***° (Extended 
Data Table 6). One of the volcanic time series shows a second peak 
four years after the eruption, probably arising from misrepresentations 
in the timing of some events. By refining our selection (Fig. 4b)—now 
including 11 volcanoes represented in the three data sets, with eruption 
years cross-checked and corrected with historical records and 
information from local deposits (Extended Data Table 7)—we obtain 
a clearer response, only significant two years after the eruption when it 
Figure 3 | Validation of the ensemble NAO reconstructions. a, b, Box-and- 
whisker plots of the correlations between the NAO_cc (yellow boxes), or NAO_mc 
(purple boxes), reconstruction ensembles and the instrumental NAO_Vinther 
index, in the calibration (a) and validation (b) subsets, respectively. Small 
triangles indicate the value of the first significant correlation coefficient (P < 0.05) 
in each ensemble. c-i, Correlations of the same ensemble reconstructions with 
other reconstructed NAO time series (Extended Data Table 4). c-e, NAO 
time series from extended sea-level-pressure data sets****, following the 
definition in ref. 23. f-i, NAO time series from other reconstructions”***>**. 
Dots represent the respective correlations with the ensemble mean; these dots 
are filled (or unfilled) if the correlations are significant (or insignificant). 


surpasses the 95% confidence level (Fig. 4a). This positive NAO res- 
ponse can partly explain the predominance of positive NAO phases 
between AD 1150 and AD 1400, a period of increased volcanic activity. 
However, this response to volcanoes might be overestimated in our 
reconstruction owing to its strong dependence on temperature- 
sensitive proxies from Greenland (Fig. 1c), which can respond directly 
to the reduced radiative conditions. Yet the fact that, in the ice cores, 
the relation between the registered δ¹⁸O and temperature is modulated 
by precipitation intermittency’’, along with the lack of NAO response 
for the first year after the eruptions, suggests that at least part of the 
estimated impact of volcanoes is dynamically driven. 

The identification of this volcanic influence is also consistent with 
observational and modelling studies***', and can be considered to be 
an additional validation of our reconstruction at the interannual time- 
scale. Adding this finding to the verification via pseudo-proxies and 


2 JULY 2015 | VOL 523 | NATURE | 73 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 4 plot. Panel a, vertical axis: Composite NAO; panel b, vertical axis: NAO index. Horizontal axis in both panels: Lag -1, In phase, Lag 1, Lag 2, Lag 3, Lag 4, Lag 5.] 

Figure 4 | NAO response to large volcanic eruptions. a, Composite NAO_mc 
response to the 11 strongest eruptions in three volcanic forcing reconstructions 
(purple line, ref. 29; blue line, ref. 28; green line, ref. 30; Extended Data Table 6), 
and to a refined selection of 11 intense volcanoes common to the three data sets 
(black line) (see Methods). Significance is assessed following a Monte Carlo 
approach with 1,000 random selections of 11 years from the NAO_mc mean 
reconstruction. The horizontal lines delimit the thresholds of significance at 
95%. The ‘lag’ refers to the number of years after the volcanic eruption. b, NAO 
values preceding and following the onset of the 11 intense eruptions common 
to the three reconstructions (Extended Data Table 7). 


earlier reconstructions, we conclude that the NAO_mc ensemble 
presents a robust, highly resolved estimate of NAO variability in the last 
millennium, providing new insights into Northern Hemisphere cli- 
matic and environmental changes during this period, and ruling out 
the persistence of positive phases during the MCA. Proxy records are 
commonly used as a benchmark with which to assess the ability of 
climate models to reproduce past variability. Here, we have used rea- 
nalyses and state-of-the-art climate simulations to improve the reliabil- 
ity of proxy reconstructions. We have demonstrated the added value of 
constraining the selection of proxies with reanalysis and model out- 
puts, and encourage the use of this approach in future reconstructions. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


Analysis of significance. In the initial proxy selection of Fig. 1a, proxies are 
required to significantly correlate with NAO_Vinther at the 90% confidence level. 
This threshold is not particularly restrictive and allows the identification of a large 
number of candidate time series that are potentially linked to the NAO. Since the 
final selection of proxies is later constrained during the calibration process, and 
also with reanalysis and model outputs, the risk of including time series with a 
spurious relationship with the NAO is small. All subsequent correlations are 
assessed by correcting the sample size to take into account the autocorrelation. 
The confidence level is kept at 90% when proxies are evaluated and in the different 
PCR steps, and increased to 95% for the estimation of the validation/calibration 
scores, and when comparing the reconstruction against other data sets. For the 
composite analysis of volcanic events (Fig. 4), significance is assessed at the 95% 
confidence level, following a Monte Carlo approach with 1,000 random equivalent 
selections. A confidence level of 95% is also applied for a two-tailed Student’s t-test 
in assessing when a 100-year running average of the NAO ensemble mean is 
different from zero (horizontal bars in Fig. 2). 
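
The Monte Carlo test used for the volcanic composite can be sketched as follows. The index series, the event years and the imposed response are all synthetic, and taking the 2.5th and 97.5th percentiles of the random composites as the 95% thresholds is an assumption about how the band is defined.

```python
import random

def composite(series, event_years, lag):
    """Mean of the series `lag` years after each event."""
    return sum(series[y + lag] for y in event_years) / len(event_years)

def monte_carlo_thresholds(series, n_events, lag, n_draws=1000, seed=0):
    """Lower and upper thresholds from composites of randomly chosen years."""
    rng = random.Random(seed)
    eligible = [y for y in series if y + lag in series]
    null = sorted(composite(series, rng.sample(eligible, n_events), lag)
                  for _ in range(n_draws))
    # Two-sided 95% band (an assumption): 2.5th and 97.5th percentiles.
    return null[int(0.025 * n_draws)], null[int(0.975 * n_draws) - 1]

# Synthetic index with an imposed +3 response two years after each event.
rng = random.Random(42)
series = {year: rng.gauss(0.0, 1.0) for year in range(1000, 1970)}
events = list(range(1050, 1930, 80))        # 11 hypothetical eruption years
for y in events:
    series[y + 2] += 3.0

low, high = monte_carlo_thresholds(series, n_events=len(events), lag=2)
response = composite(series, events, lag=2)
print(response > high)
```

A composite is called significant at a given lag when it falls outside the band spanned by the random selections.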

Selection of proxies. The initial set of proxy records (Extended Data Table 1) is 
the result of an extensive search throughout different palaeoclimate data 
repositories, such as the International Tree-Ring Data Bank (ITRDB; 
https://www.ncdc.noaa.gov/data-access/paleoclimatology-data/datasets/tree-ring), 
http://www.ncdc.noaa.gov/data-access/paleoclimatology-data and http://www.pangaea.de/. 
Many proxies (>150) from the Northern Hemisphere were found to meet the 
two imposed criteria: annual resolution and continuous time coverage from the 
eleventh to the twentieth century. These proxies were further screened for their 
correlation with NAO_Vinther at the 90% confidence level. Only the 48 proxies used 
in our analysis met this condition. 

Climatic interpretation of proxies. Both the seasonality and the climate variable 
associated with each proxy were extracted either from the original references 
detailed in Extended Data Table 1, or from subsequent references making use of 
them. For instance, a few tree-ring proxy records (such as Lily Lake) are inter- 
preted herein in terms of annual temperature as they have already contributed to 
several multi-proxy reconstructions for past global and continental temper- 
ature*’**. When two possible climatic interpretations of the same proxy record 
exist, we kept the one leading to the highest correlations with the NAO in the 
twentieth century reanalysis (for example, Mammoth Peak as winter precipitation 
instead of summer temperature). 

Potential biases in the representation of low-frequency variability. No post- 
processing has been applied to any of the proxy time series considered. All chro- 
nologies therefore correspond to the source data as they were originally provided. 
Note that the tree-ring records might lack some part of their low-frequency vari- 
ability owing to the age-trend removal techniques. The degree of variance sacri- 
ficed may vary substantially from one record to another, because some research 
groups privilege more conservative detrending methods, and others prefer to 
retain as much low-frequency variability as possible. Ice-core records are also 
generally corrected for postdepositional effects, such as firn diffusion, or ice flow 
in the Greenland ice sheet. Firn diffusion affects only subannual variability, and 
may introduce biases in the seasonally resolved records. It does not affect low- 
frequency variability or trends. Ice flow is important only in regions of significant 
ice-flow rate, such as the DYE-3 drill site, and requires flow models to be corrected; 
it might affect to some degree the realism of the long-term trends. Other potential 
low-frequency biases relate to bandwidth speleothem measurements and varved 
lake sediments that need to be corrected for nonhomogenous growth rates and 
compactness, respectively. More generally, all proxy records are affected to some 
extent by local environmental changes (such as wind scouring for ice cores), which 
can perturb the recorded climatic signal. This is why it is important to perform 
multi-proxy reconstructions—so that most of these potential biases can be can- 
celled out, and the common climate signal better identified. 

PCR calculation. The reconstruction method is based on a previous study” that 
uses a nested PCR to gradually adjust to the changing number of source proxy 
records available in consecutive periods of 25 years. Here, we consider only proxies 
that extend with no interruption from the eleventh to the twentieth century, so 
only one PCR needs to be calculated (for each ensemble realization). The PCR 
approach is applied as follows: all selected proxy records are first standardized with 
respect to the common period AD 1823-1969, and used to define a matrix (in time
and space) for the period common to all proxies. This period can change depend- 
ing on the actual selection, the shortest being from AD 1073 to AD 1969 when the 
proxy record from Firth River is available. For each ensemble realization, the 
corresponding subset of years previously set for calibration is extracted from that 
matrix. A principal-components analysis (PCA) of this reduced matrix is then
performed. Preisendorfer’s rule N* is applied to discard those principal
components (PCs) whose explained variability is indistinguishable (at the
95% significance level) from that of a matrix of red-noise processes composed of
first-order autoregressive models, AR(1), of the original proxies. For this assessment,
300 independent Gaussian white-noise matrices are generated (with the same
dimensions as the reduced one), and a PCA is applied to each of them. In each 
realization, the PCs are organized in decreasing order of their variance across the 
Gaussian set. We then take all of the first PCs, and establish the 95th percentile of 
the associated variances. This step is repeated from the second to the last PC. The 
95th-percentile values thus obtained are used as significance thresholds for the 
truncation of the original PCs. The aim of this truncation is to retain all the modes 
with relevant climate information, and to use them to produce a multivariate linear 
regression model of the NAO. To extend the reconstruction back to the eleventh 
century, the complete matrix of predictors is projected into the eigenvectors of the 
selected PCs (also known as empirical orthogonal functions, EOFs), fitting by least 
squares. These projections describe the variability associated with each EOF over 
the whole time extent of the matrix of predictors, and coincide with the original 
PCs in the calibration period. These are the series feeding the multivariate model to 
produce the actual NAO reconstruction. To correct for the loss in variance inher- 
ent to the multiple regression, the final NAO time series is restandardized with 
respect to the calibration period. No trends are removed before the PCR analysis, 
since they are also essential to reproduce the past NAO variability. This is a 
standard reconstruction practice’. 
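
The steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the code from the Supplementary Information: the function names are hypothetical, the proxies are assumed to be standardized already, the noise matrices are white Gaussian, truncation keeps the leading PCs up to the first one that fails the test, and "restandardized" is read as matching the mean and variance of the target over the calibration years.

```python
import numpy as np


def rule_n_keep(x_cal, n_sim=300, pct=95.0, seed=0):
    """Count leading PCs whose explained-variance fraction exceeds noise."""
    rng = np.random.default_rng(seed)

    def variance_fractions(m):
        s = np.linalg.svd(m - m.mean(axis=0), compute_uv=False)
        return s**2 / np.sum(s**2)

    observed = variance_fractions(x_cal)
    simulated = np.array(
        [variance_fractions(rng.standard_normal(x_cal.shape)) for _ in range(n_sim)]
    )
    thresholds = np.percentile(simulated, pct, axis=0)  # one threshold per PC rank
    n_keep = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:
            break
        n_keep += 1
    return n_keep


def pcr_reconstruct(proxies, target_cal, cal_idx, n_keep):
    """Principal-component regression; requires n_keep >= 1.

    proxies: (n_years, n_proxies) array; target_cal: target at rows cal_idx.
    """
    x_cal = proxies[cal_idx]
    centre = x_cal.mean(axis=0)
    # PCA of the calibration matrix; rows of vt are the EOFs.
    _, _, vt = np.linalg.svd(x_cal - centre, full_matrices=False)
    eofs = vt[:n_keep].T
    # Project the complete predictor matrix onto the retained EOFs.
    pcs = (proxies - centre) @ eofs
    # Multivariate linear regression (with intercept) over the calibration years.
    design_cal = np.column_stack([np.ones(len(cal_idx)), pcs[cal_idx]])
    coef, *_ = np.linalg.lstsq(design_cal, target_cal, rcond=None)
    recon = np.column_stack([np.ones(len(proxies)), pcs]) @ coef
    # Restore the variance lost in the regression.
    rc = recon[cal_idx]
    return (recon - rc.mean()) / rc.std() * target_cal.std() + target_cal.mean()
```

In an ensemble setting, `cal_idx` would hold the calibration years drawn for one realization, and the function would be called once per realization.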

Proxy contribution to the ensemble reconstruction. In each individual PCR 
reconstruction, the beta-weights of the proxies are obtained as in ref. 24: we 
multiply the vector of the standardized regression coefficients (also known as 
beta-weights on the PC space) by the matrix of the retained eigenvectors. In 
practice, for an individual proxy record i, this corresponds to: 


$$\beta(i) = \sum_{j=1}^{m} c_j \, e_j(i)$$

where $\beta(i)$ denotes the beta-weight of the given proxy, $m$ the number of predictors
(that is, PCs) participating in the multivariate regression, $c_j$ the regression coefficient
for the predictor $j$, and $e_j(i)$ the loading associated with the proxy $i$ in the
eigenvector of predictor j. The beta-weight of the proxy i in the ensemble mean 
reconstruction is obtained as the sum of its respective beta-weights in the indi- 
vidual realizations. 
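
As a small worked sketch of this calculation: the beta-weight of a proxy is the sum, over the retained PCs, of each regression coefficient times the loading of that proxy in the corresponding eigenvector. The function names and the numbers in the usage note are illustrative assumptions.

```python
def proxy_beta_weights(coefs, eigenvectors):
    """Beta-weight of each proxy in one PCR realization.

    coefs[j] is the regression coefficient of predictor (PC) j;
    eigenvectors[j][i] is the loading of proxy i in the eigenvector of PC j.
    """
    n_proxies = len(eigenvectors[0])
    return [
        sum(c * e[i] for c, e in zip(coefs, eigenvectors))
        for i in range(n_proxies)
    ]


def ensemble_beta_weights(per_realization):
    """Sum the beta-weights of each proxy over the individual realizations."""
    return [sum(weights) for weights in zip(*per_realization)]
```

With two retained PCs, coefficients `[2.0, -1.0]` and eigenvectors `[[0.5, 0.5], [1.0, -1.0]]`, the two proxies receive beta-weights 0.0 and 2.0.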

Estimation of reconstruction uncertainties. Our method accounts only for 
regression-based uncertainties associated with the residuals in the calibration 
period. These are represented by the s.e. of the regression, calculated as the square 
root of the sum of the squared residuals divided by the degrees of freedom: 


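$$\mathrm{s.e.} = \sqrt{\frac{\sum_{i=1}^{n} \varepsilon_i^{2}}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}{n-2}}$$

where $n$ represents the number of time steps in the calibration period, $\varepsilon$ the
residuals, $y$ the predicted, and $\hat{y}$ the reconstructed variable.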

Standard errors are calculated for each of the 100 selected individual reconstructions.
In each of them, the uncertainty band is set at ±2 s.e. The envelope of these
bands across the different ensemble members is included in Fig. 2 to describe the
total uncertainty range of each ensemble reconstruction. The perfect-model analysis
supports the validity of this approach, with only about 2% of years falling
outside the ±2 s.e. envelope. The characterization of other sources of uncertainty
(such as chronological errors) lies beyond the scope of our analysis. 
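
The uncertainty estimate can be sketched as follows. The function names are hypothetical; the sketch computes the standard error of the regression from the calibration residuals with n - 2 degrees of freedom, and then takes the envelope of the ±2 s.e. bands across ensemble members.

```python
import math


def regression_standard_error(observed, reconstructed):
    """Square root of the summed squared residuals over (n - 2)."""
    n = len(observed)
    ss = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, reconstructed))
    return math.sqrt(ss / (n - 2))


def ensemble_envelope(reconstructions, standard_errors, k=2.0):
    """Lower and upper envelope of the +/- k s.e. bands of all members."""
    n_steps = len(reconstructions[0])
    members = list(zip(reconstructions, standard_errors))
    lower = [min(r[t] - k * se for r, se in members) for t in range(n_steps)]
    upper = [max(r[t] + k * se for r, se in members) for t in range(n_steps)]
    return lower, upper
```

Each entry of `reconstructions` is one ensemble member over the full period, paired with the standard error obtained from its own calibration years.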

Sensitivity to the reconstruction choices. Several alternative ensemble recon- 
structions have been performed to test the sensitivity to some of the choices in our 
approach. Overall, the reconstructions and the main findings remain coherent. For 
instance, increasing the validation period to 40 years (and correspondingly reducing the
calibration period to 107 years) barely affects the final NAOmc ensemble mean reconstruction.
Similar results also hold if Preisendorfer’s rule N is based on a white noise
floor assumption, or if proxy selection is constrained only through reanalyses or 
PMIP3 simulations. Additional criteria for ensemble generation, such as subsam- 
pling the proxies depending on their kind, are found to increase the uncertainty 
bars because of the poor skill of some members, but have a minor effect on the 
variability described by the mean ensemble reconstruction. 

Model-constrained ensemble reconstructions. Given some known deficiencies 
in the reanalyses in terms of accurately representing variables such as precipitation 
in high-elevation regions (for example, Greenland or the Alps), and considering 
that reanalyses have different spatial resolutions and cover different time periods 
(Extended Data Table 2) with different degrees of confidence, we apply a relaxed 
constraint, so that any proxy with a significant relationship with the NAO in any of 
the four reanalyses is kept. Similarly, since not all PMIP3 models are equally 
realistic in the representation of the NAO fingerprints on surface climate“, but 
keeping in mind that with them we test the robustness and stability of the
teleconnections, the ensemble NAOmc reconstruction is additionally constrained to
those proxy records that correlate significantly with the NAO in at least half of the
eight PMIP3 simulations. 
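
The two constraints can be expressed as a simple selection rule. In this sketch the function name and the boolean-flag inputs are assumptions: each proxy carries one flag per reanalysis and one per PMIP3 simulation, stating whether its correlation with the NAO is significant there.

```python
def model_constrained_selection(reanalysis_sig, model_sig, min_fraction=0.5):
    """Keep proxies significant in any reanalysis and in enough simulations.

    reanalysis_sig[name] and model_sig[name] are lists of booleans.
    """
    kept = []
    for name, flags in reanalysis_sig.items():
        in_any_reanalysis = any(flags)
        models = model_sig[name]
        in_enough_models = sum(models) >= min_fraction * len(models)
        if in_any_reanalysis and in_enough_models:
            kept.append(name)
    return kept
```

With eight simulations and `min_fraction=0.5`, a proxy needs a significant correlation in at least four of them.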

Perfect-model approach. This technique is used to test the adequacy or realism of 
different assumptions or methods. It is based on the use of model outputs as 
substitute climate records that can be used, for instance, to assess the performance 
of different reconstruction strategies*”’, or the potential of decadal predictions 
(both statistical*? and dynamical***). The model is assumed to be ‘perfect’ in the 
sense that it describes an idealized surrogate reality, where climate variations are 
physically consistent, without any measurement uncertainty, and are accessible 
everywhere without restrictions and for the whole extent of the simulations. In our 
analysis, we mimic the original proxies with model outputs (see below), and use 
these pseudo-proxies to reproduce the different steps of our reconstruction 
approach, but within the model world. One of the main advantages that this 
perfect model framework offers is that validation scores can also be estimated 
outside of the observational period AD 1823-1969. This is particularly interesting, 
since it allows a validation over a long time frame (AD 1000-1822), and thus 
provides a test of the stability of the assumed teleconnections. 

Pseudo-proxy definition. All pseudo-proxies are defined as a 2 × 2 average of the
related climate variable centred on the proxy location, using the same reported 
seasonal window (Extended Data Table 1). These are the series used to identify the 
proxies that are consistent with the reanalyses and the PMIP3 simulations 
(Fig. 1c). The pure simulated climatic signal is later perturbed with noise for the 
perfect-model analysis. The noise is estimated as an AR(1) process that describes 
the residual of subtracting the normalized real proxy from normalized obser- 
vational data at the same location. Temperature data are retrieved from 
Berkeley Earth (http://berkeleyearth.org/data) and precipitation from the 
Climate Research Unit (http://badc.nerc.ac.uk/browse/badc/cru) data sets. Our 
approach therefore mimics the actual noise that arises from relating a proxy signal 
to a single climate variable. Ten alternative pseudo-proxies are generated for ten 
different realizations of the noise (Extended Data Fig. 3). For the perfect-model 
analysis, we use the pseudo-proxy whose correlation with the simulated NAO is
closest to that of the actual proxy with NAOvinther. This is done to be more
representative of the real proxy sensitivity. The hypothetical pseudo-proxies from 
Spain and Norway (where no real proxy is yet available), which are necessary 
for the NAOLehner pseudo-reconstruction, were generated by adding Gaussian
noise with properties similar to those of the twentieth century reanalysis at 
these locations. 
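
A minimal sketch of the noise model follows, using only the standard library. The function names, the lag-1 estimator of the AR(1) coefficient, and the non-stationary start of the noise series are assumptions made for illustration; the text specifies only that the noise is an AR(1) process fitted to the residual between the normalized observations and the normalized proxy.

```python
import math
import random
from statistics import mean, pstdev


def standardize(x):
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def fit_ar1(residual):
    """Lag-1 coefficient and innovation standard deviation of a residual."""
    m = mean(residual)
    d = [v - m for v in residual]
    phi = sum(a * b for a, b in zip(d[1:], d[:-1])) / sum(a * a for a in d)
    sigma = pstdev(residual) * math.sqrt(max(0.0, 1.0 - phi * phi))
    return phi, sigma


def ar1_noise(n, phi, sigma, rng):
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def pseudo_proxies(simulated, real_proxy, observed, n_real=10, seed=0):
    """Perturb the simulated climate signal with n_real AR(1) noise series."""
    rng = random.Random(seed)
    resid = [o - p for o, p in zip(standardize(observed), standardize(real_proxy))]
    phi, sigma = fit_ar1(resid)
    clean = standardize(simulated)
    return [
        [c + e for c, e in zip(clean, ar1_noise(len(clean), phi, sigma, rng))]
        for _ in range(n_real)
    ]


def closest_realization(realizations, simulated_nao, target_r):
    """Pick the realization whose NAO correlation is closest to target_r."""
    return min(realizations, key=lambda p: abs(pearson(p, simulated_nao) - target_r))
```

Here `target_r` stands for the correlation of the actual proxy with NAOvinther, and `simulated_nao` for the NAO index of the same simulation.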

Identification of strong volcanic eruptions. This analysis uses the three latest 
published reconstructions of past volcanic activity covering the past millennium. 
Two of them (refs 28, 29) combine volcanic-related information from Greenland and
Antarctic ice cores, although they differ in the source records and the calibration
strategy. The third reconstruction (ref. 30) is based exclusively on an extensive array of
Antarctic ice cores, most of them not included in the former reconstructions. The 
combination of multiple ice-core records is an important step, accounting for age- 
scale and deposition uncertainties. The three reconstructions show important 
discrepancies regarding the timing and relative magnitude of some eruptions. 
This is illustrated through the selection of the 11 strongest events in each volcanic 
reconstruction (Extended Data Table 6). We first identify the strongest volcano 
according to the related reconstructed variable, and screen out all dates that are 
3 years before or after the eruption (to avoid double selections). This process is 
repeated with the remaining dates until 11 events have been identified. Note that 
even when a specific eruption is present in the three volcanic series, the uncertainty 
in the year of occurrence can be as large as 2 years (for example, 1284, 1286, 1285). 
To reduce this timing uncertainty, and to ensure that we select only well supported 
volcanic eruptions, we extract the 25 largest eruptions in each volcanic chronology 
(not shown), and compare them to identify the episodes common to all reconstructions
(within a ±3 year range). This leads to a final selection of 11 common volcanoes
(Extended Data Table 7). The final adopted dates for the composite analysis
in Fig. 4 are extracted from historical sources (http://www.volcano.si.edu/)** or other 
reliable sedimentary records*’. For the three unknown eruptions, for which no 
additional information is available, we took either the most repeated date, or the 
average date of the three reconstructions. 
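
The event selection described above can be sketched in a few lines. The function names and the dictionary input (year mapped to the reconstructed volcanic variable) are illustrative assumptions.

```python
def strongest_events(series, n_events=11, window=3):
    """Iteratively pick the strongest year, then mask +/- `window` years.

    series maps year -> magnitude of the reconstructed volcanic variable.
    """
    remaining = dict(series)
    events = []
    while remaining and len(events) < n_events:
        year = max(remaining, key=remaining.get)
        events.append(year)
        for y in range(year - window, year + window + 1):
            remaining.pop(y, None)  # avoid double selection of one eruption
    return sorted(events)


def common_events(event_lists, tolerance=3):
    """Events of the first list that have a match in every other list."""
    first, *others = event_lists
    return [
        y for y in first
        if all(any(abs(y - z) <= tolerance for z in other) for other in others)
    ]
```

For example, with dates taken from Extended Data Table 6, `common_events([[1229, 1258, 1286, 1456], [1227, 1258, 1284, 1452], [1229, 1257, 1285, 1458]])` keeps 1229, 1258 and 1286 but drops 1456, because 1452 differs from it by more than 3 years.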

Code availability. The code used to produce the model-constrained ensemble 
reconstruction is available in the Supplementary Information. 
Extended Data Figure 1 | Verification of the pseudo-reconstructions. Box-and-whisker
plots showing the correlations between the six pseudo-reconstructed NAOmc ensembles
and the corresponding true simulated NAOs, in three independent subsets of years
(from left to right for each ensemble): the 117 years selected for calibration
(specific to each ensemble realization), the remaining 30 years that need to be
validated (following the original strategy), and a long validation period from
AD 1000-1822 (possible thanks to the perfect-model approach). In this latter case,
since the period is the same for all the ensemble realizations, the correlation is
also calculated for the ensemble mean (filled coloured dots). Red horizontal lines
indicate the first significant correlation coefficient (P < 0.05) of the ensemble.
Extended Data Figure 2 | Comparison of three alternative reconstructions.
The figure shows probability density functions (PDFs) for the 50-year moving-window
correlations between the simulated NAO and three alternative pseudo-reconstructions
in the ensemble of PMIP3 simulations (Extended Data Table 3):
a, NAOTrouet; b, NAOmc; c, NAOLehner. Vertical dotted lines represent
the median correlation for the ensemble. Each coloured line represents a
different pseudo-reconstructed NAO ensemble.
Extended Data Figure 3 | NAO-pseudo-proxy relationships.
a-f, Correlations between the simulated NAO and the associated pseudo-proxies
in the PMIP3 runs. Pseudo-proxies are defined from a pure climatic
signal (coloured asterisks), and subsequently perturbed with AR(1) noise (thin
green lines). For the perturbed pseudo-proxy definitions, ten different 
realizations of the noise are considered. The correlations between the actual 
proxies and NAOvinther in the common period AD 1823-1969 are also shown 
for comparison (thick black crosses). 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


Extended Data Table 1 | Proxy description 


N   Site                     Archive        Proxy type               LON     LAT   Time span  Corr.  Variable Seasonality   Refs
1   B18                      Ice core       Snow Accumulation        -36.4   76.6  1000-1992  0.19   Precip   Annual        48
2   Crete                    Ice core       Snow Accumulation        -39.7   71    1000-1973  -0.19  Precip   Annual        49
3   GISP2                    Ice core       Snow Accumulation        -40.9   72.5  1000-1988  -0.12  Precip   Annual        50
4   Agassiz (A79)*           Ice core       δ18O†                    -73.1   80.7  1000-1972  -0.18  SAT      Annual        51
5   Crete                    Ice core       δ18O†                    -37.3   71.1  1000-1973  -0.31  SAT      NDJFMA        52
6   DYE-3 (stack)            Ice core       δ18O†                    -43.8   65.2  1000-1978  -0.39  SAT      NDJFMA        52
7   GISP2                    Ice core       δ18O†                    -38.5   72.6  1000-1987  0.21   SAT      Annual        53
8   GRIP                     Ice core       δ18O†                    -37.5   72.6  1000-1979  -0.13  SAT      NDJFMA        52
9   Allos Lake               Lake Sediment  Flood deposit thickness  6.7     44.2  1000-2009  0.11   Precip   SON           54
10  Donard Lake              Lake Sediment  Varve thickness          -61.4   66.7  1000-1995  -0.14  SAT      JJA           55
11  Hvitarvatn Lake          Lake Sediment  Varve thickness          -19.8   64.6  1000-2000  -0.11  SAT      JJAS          56
12  Lower Murray Lake        Lake Sediment  Mass accumulation        -69.5   81.3  1000-1969  -0.18  SAT      JJA           57
13  Crystal Cave             Speleothem     δ18O†                    -18.8   36.6  1000-2007  -0.15  SAT      Annual        58
14  Uamh an Tartair          Speleothem     Bandwidth                -4.9    58.1  1000-1995  0.24   Precip   DJFM          59
15  Balkan Peninsula         Tree ring      Tree-ring width          20.0    41.0  1000-2008  -0.10  SAT      February      60
16  European Alps            Tree ring      Tree-ring MXD            7.5     46.0  1000-2004  0.11   SAT      JJAS          61
17  Black Swamp              Tree ring      Tree-ring width          -91.3   35.2  1019-1980  0.16   SAT      Annual        62
18  Mayberry Slough          Tree ring      Tree-ring width          -91.3   35.6  1000-1990  0.12   SAT‡     Annual        63
19  Avam-Taimyr              Tree ring      Tree-ring width          101.0   72.0  1000-2000  -0.10  SAT      July          64
20  San Gorgonio             Tree ring      Tree-ring width          -116.8  34.1  1000-1970  0.19   SAT      Annual        65
21  Flower Lake              Tree ring      Tree-ring width          -118.4  36.8  1000-1987  -0.14  Precip   NDJFM         66
22  Timber Gap Upper         Tree ring      Tree-ring width          -118.6  36.5  1000-1987  -0.13  Precip   NDJFM         66
23  Cirque Peak              Tree ring      Tree-ring width          -118.2  36.5  1000-1987  -0.13  Precip   NDJFM         66
24  Mammoth Peak             Tree ring      Tree-ring width          -119.3  37.9  1000-1996  -0.10  Precip   NDJFM         66
25  Boreal Plateau           Tree ring      Tree-ring width          -118.6  36.5  1000-1992  -0.14  Precip   NDJFM         66
26  Upper Wright Lakes       Tree ring      Tree-ring width          -118.4  36.6  1000-1992  -0.14  Precip   NDJFM         66
27  Hamilton                 Tree ring      Tree-ring width          -118.9  37.0  1000-1988  -0.22  Precip   NDJFM         66
28  Lily Lake                Tree ring      Tree-ring width          -105.6  40.3  1000-1998  -0.12  SAT‡     Annual        67
29  Firth River              Tree ring      Tree-ring MXD            -141.6  68.7  1073-2002  0.14   SAT      JJA           68
30  Choctawhatchee River     Tree ring      Tree-ring width          -85.9   30.5  1000-1992  0.14   SAT‡     Annual        63
31  Forfjorddalen            Tree ring      Tree-ring δ13C           15.7    68.8  1000-2001  -0.14  Cloud%   JJA           69
32  French Alps              Tree ring      Tree-ring width          7.5     44.0  1000-2007  0.10   SAT      AMJJA         70
33  Big Cypress              Tree ring      Tree-ring width          -93.0   32.3  1000-1988  0.10   SAT‡     Annual        71
34  Morocco                  Tree ring      Tree-ring width          -5.0    33.8  1049-2001  -0.24  SPI      FMAMJ         72
35  Yellow Mountain Ridge I  Tree ring      Tree-ring width          -111.3  45.3  1000-1998  -0.13  Precip   NDJFM         73
36  ae id |                  Tree ring      Tree-ring width          111.3   45.3  1000-1998  0.14   Precip   NDJFM         73
37  Mesa Alta                Tree ring      Tree-ring width          -106.6  36.2  1000-2007  0.13   Precip   ONDJFMAMJ     74
38  Finland                  Tree ring      Tree-ring MXD            25.0    68.0  1000-2006  -0.11  SAT      JJA           75
39  Hill 10842               Tree ring      Tree-ring width          -114.2  38.9  1000-1984  0.20   SAT‡     Annual        76
40  Springs Mountains Lower  Tree ring      Tree-ring width          -115.7  36.3  1000-1984  0.11   SAT      Annual        77
41  S. Colorado Plateau I    Tree ring      Tree-ring width          -110.0  37.0  1000-1987  0.18   Precip   October-July  78
42  S. Colorado Plateau II   Tree ring      Tree-ring width          -110.0  37.0  1000-1996  -0.11  SAT      Annual        78
43  Four Holes Swamp         Tree ring      Tree-ring width          -80.4   33.2  1001-1985  -0.10  Precip   MAMJ          79
44  Taimyr - Putoran         Tree ring      Tree-ring width          103.0   71.3  1000-1996  -0.11  SAT      Annual        80
45  Tatra Region             Tree ring      Tree-ring width          20.0    49.0  1040-2011  -0.12  SAT      May-June      81
46  Lauenen+ div. Stao       Tree ring      Tree-ring width          Tee     46.4  1000-1976  0.14   SAT      JJA           82
47  Wild Horse Ridge         Tree ring      Tree-ring width          -111.1  39.4  1000-1985  0.19   SAT‡     Annual        83
48  Mammoth Creek            Tree ring      Tree-ring width          -112.7  37.7  1000-1989  0.18   SAT‡     Annual        84


* Proxies participating in the NAOmc reconstruction are highlighted in bold.

† Isotope values are referred to Vienna Standard Mean Ocean Water (VSMOW).

‡ These proxy records are sensitive to negative changes in the related variable.

Data are from refs 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84. LON, longitude; LAT, latitude; SAT, surface air 
temperature. 
Extended Data Table 2 | Description of the reanalyses.

Reanalysis                       Time interval analysed  Horizontal resolution                  Reference
NCEP/NCAR Reanalysis 1           1949-2008               Global grid (2.5° × 2.5°)              Ref. 85
Twentieth Century Reanalysis V2  1872-2010               T62 Gaussian grid (~2.0° × 2.0°)       Ref. 86
ERA Interim                      1980-2011               T255 Gaussian grid (~0.75° × 0.75°)    Ref. 87
ERA40                            1959-2001               T159 Gaussian grid (~1.125° × 1.125°)  Ref. 88

Data are from refs 85, 86, 87, 88. NCEP, National Centers for Environmental Prediction; NCAR, the National Center for Atmospheric Research.


Extended Data Table 3 | Description of the simulations. 


Simulation*     PMIP3/CMIP5 family of experiments          Time span
BCC-CSM1-1      past1000(r1i1p1)+historical(r1i1p1)        850-2005 CE
CCSM4           past1000(r1i1p1)+continuation              850-2005 CE
CSIRO-MK3L-1-2  past1000(r1i1p1)+continuation              851-2005 CE
FGOALS-gl       past1000(r1i1p1)                           1000-1999 CE
GISS-E2-R       past1000(r1i1p121)+continuation            850-2005 CE
IPSL-CM5A-LR    past1000(r1i1p1)+continuation              850-1999 CE
MIROC-ESM       past1000(r1i1p1)+continuation              850-2005 CE
MPI-ESM-P       past1000(r1i1p1)+continuation              850-2005 CE
Reference
Ref. 89
Ref. 90
Ref. 91†
Ref. 92 
Ref. 93 


Ref. 94 
Ref. 95 


* Further information on the resolution and forcings used in the past1000 simulations can be found at https://wiki.lsce.ipsl.fr/pmip3/doku.php/pmip3:database:status#past1000_experiment_status_and_be

† This article describes a different simulation, but specifies the differences with respect to the PMIP3 run.
Data are from refs 89, 90, 91, 92, 93, 94, 95. 
Extended Data Table 4 | Principal-component (PC) contributions to the ensemble reconstructions.

Simulation  PC1  PC2  PC3  PC4  PC5
NAOcc       100  98   49   11   2
NAOmc       100  0    0    0    0
Extended Data Table 5 | Description of the NAO/sea-level pressure (SLP) records used for validation.

Reference  Data sources*                             Regions represented                     Validation period
Ref. 32    Instrumental pressure data                Eastern North Atlantic/Europe           1781-1822 CE
Ref. 33    Instrumental, documentary and proxy data  Eastern North Atlantic/Europe           1659-1822 CE
Ref. 34    Instrumental SLP and wind from ship logs  Eastern North Atlantic/Europe           1751-1822 CE
Ref. 35    5 tree-ring and 2 ice cores               Morocco/Finland/Greenland               1429-1822 CE
Ref. 36    Instrumental, documentary and proxy data  Eastern North Atlantic/Europe           1500-1822 CE
Ref. 24    367 records (tree-rings and ice cores)    Europe/Greenland/Eastern North America  1400-1822 CE
Ref. 7     1 speleothem and 1 tree-ring chronology   Scotland/Morocco                        1049-1822 CE

* Only the two proxies used in ref. 7 have been included in our analysis.
Data are from refs 7, 24, 32, 33, 34, 35, 36. CE, Common Era, equivalent to AD.
Extended Data Table 6 | The 11 largest volcanic eruptions between AD 1049 and AD 1969 in three alternative reconstructions.

Reconstruction  Reconstructed variable                                         List of volcanoes
Ref. 28         Global aerosol optical depth at 550 nm*                        1229, 1258, 1286, 1456, 1600, 1641, 1674, 1696, 1809, 1816, 1884
Ref. 29         Global total stratospheric sulfate aerosol injection (in Tg)†  1167, 1227, 1258, 1275, 1284, 1452, 1600, 1641, 1783, 1809, 1815
Ref. 30         Antarctic volcanic sulfate (in kg/km²)‡                        1169, 1229, 1257, 1276, 1285, 1344, 1458, 1600, 1694, 1809, 1815

* Data from http://www1.ncdc.noaa.gov/pub/data/paleo/climate_forcing/volcanic_aerosols/crowley2013/crowley2013aod-reff.txt
† Revised version from http://climate.envsci.rutgers.edu/IVI2/IVI2TotalInjection_501-2000Version2.txt
‡ Data available in the supplement of ref. 30.
Extended Data Table 7 | Eleven large volcanic eruptions common to the three alternative reconstructions.

Volcano       Country      Final selected date  Date from Ref. 28  Date from Ref. 29  Date from Ref. 30
Unknown                    1229                 1229               1227               1229
Samalas       Indonesia    1257*                1258               1258               1257
Unknown                    1285                 1286               1284               1285
Huaynaputina  Peru         1600†                1600               1600               1600
Parker        Philippines  1640†                1641               1641               1641
Serua         Indonesia    1693†                1696               1693               1694
Unknown                    1809                 1809               1809               1809
Tambora       Indonesia    1815†                1816               1815               1815
Cosiguina     Nicaragua    1835†                1835               1835               1834
Krakatau      Indonesia    1883†                1884               1883               1884
Agung         Indonesia    1963†                1964               1963               1963


* Date recently constrained using local deposits and historical records*’. 
† Date extracted from historical observations*®.
Hallucigenia’s head and the pharyngeal armature of early ecdysozoans


Martin R. Smith1 & Jean-Bernard Caron2,3


The molecularly defined clade Ecdysozoa’ comprises the panar- 
thropods (Euarthropoda, Onychophora and Tardigrada) and the 
cycloneuralian worms (Nematoda, Nematomorpha, Priapulida, 
Loricifera and Kinorhyncha). These disparate phyla are united by 
their means of moulting, but otherwise share few morphological 
characters—none of which has a meaningful fossilization potential. 
As such, the early evolutionary history of the group as a whole is 
largely uncharted. Here we redescribe the 508-million-year-old 
stem-group onychophoran Hallucigenia sparsa*® from the 
mid-Cambrian Burgess Shale. We document an elongate head with 
a pair of simple eyes, a terminal buccal chamber containing a radial 
array of sclerotized elements, and a differentiated foregut that is 
lined with acicular teeth. The radial elements and pharyngeal teeth 
resemble the sclerotized circumoral elements and pharyngeal teeth 
expressed in tardigrades” °, stem-group euarthropods’” ” and cyclo- 
neuralian worms’’. Phylogenetic results indicate that equivalent 
structures characterized the ancestral panarthropod and, seemingly, 
the ancestral ecdysozoan, demonstrating the deep homology of 
panarthropod and cycloneuralian mouthparts, and providing an 
anatomical synapomorphy for the ecdysozoan supergroup. 
Although Cambrian ecdysozoans offer an unrivalled perspective on 
early ecdysozoan evolution®™, considerable uncertainty surrounds the 
morphology of the ancestral ecdysozoan. One of the few areas of agree- 
ment is that this ancestor bore a pharynx lined with ectodermally 
derived, periodically moulted cuticle’ and opening at a terminal mouth”. 
In many ecdysozoan taxa, the pharynx is lined with sclerotized 
teeth®’®!*">"6, and the mouth is surrounded by circumoral elements. 


The typical cycloneuralian mouth is surrounded by a ring of spines”; 
the tardigrade mouth bears circumoral lamellae'’’*"’; stem-group 
euarthropods (such as Hurdia, Kerygmachela and Jianshanopodia) 
exhibit various lamellae and plates'*’*; and the onychophoran mouth 
is enclosed by pustular lips. These elements have formerly been 
regarded as homologous throughout Ecdysozoa’*’*'**", However, 
the non-sclerotized lips of onychophorans are not strictly circumoral”, 
and onychophorans conspicuously lack pharyngeal teeth’®. This sug- 
gests two possibilities: (1) a foregut armature of circumoral elements 
and pharyngeal teeth did exist in the ancestral ecdysozoan, but was 
secondarily lost in onychophorans; or (2) homoplasious armatures 
arose independently in Panarthropoda (either once or twice, depend- 
ing on panarthropod relationships*”) and Cycloneuralia. 

The earliest history of onychophorans is central to this dilemma. 
The first scenario implies that foregut armature was present in the 
ancestral onychophoran, whereas under the second, onychophorans 
never had foregut armature. To reconstruct the ancestral configura- 
tion of the onychophoran foregut, we turned to the lobopodian 
Hallucigenia sparsa**, now regarded as a stem-group onychophoran’®. 
Until now, this taxon’s potential significance for early ecdysozoan 
evolution has been curtailed by uncertainty in its morphological 
interpretation: Hallucigenia has variously been reconstructed on its 
side, upside down and back to front (Extended Data Table 1). New 
material (Supplementary Table 1) and high-resolution microscopic 
analysis reveals many anatomical features in Hallucigenia for the first 
time. In particular, robust carbonaceous elements occur around 
Hallucigenia’s mouth and along its pharynx, implying that the ancestral
onychophoran, and seemingly the ancestral ecdysozoan, bore
circumoral elements and pharyngeal teeth.

Figure 1 | Optical images of Hallucigenia sparsa from the Burgess Shale
(anterior to the left). a, ROM 62269 (see also Extended Data Fig. 5); b, ROM
63142; scanning electron microscope (SEM) images are provided in Fig. 2g-l
and Extended Data Fig. 3c; c, ROM 63051; see also Extended Data Fig. 3b, d-g;
d, ROM 63146; high-magnification images are provided in Fig. 2a-f and
Extended Data Fig. 7; e, NUNH 198658; see also Extended Data Fig. 2b, c;
f, g, anterior section of ROM 57168; see also Extended Data Fig. 1c-e.
Acronyms for all figures: A, appendages; Ac, aciculae; An, anus; Bc, buccal
chamber; C, claw; Ce, circumoral elements; Cs, circumoral structure; Df, decay
fluids; E, eyes; F, foregut; G, gut; l, left; Mo, mouth opening; r, right; S, spines;
A1-n or S1-n, order of A or S from front to back. Dotted white lines identify
areas enlarged in Fig. 2 and Extended Data figures, as denoted in captions.
Unbroken white lines in b-d represent edges of the composite images of both
parts and counterparts superimposed together. Black and white arrowhead
denotes image flipped horizontally. Scale bars, 5 mm (a-e), 0.5 mm (f, g).

1Department of Earth Sciences, University of Cambridge, Cambridge CB2 3EQ, UK. 2Department of Natural History (Palaeobiology Section), Royal Ontario Museum, Toronto, Ontario M5S 2C6, Canada.
3Departments of Ecology and Evolutionary Biology and Earth Sciences, University of Toronto, Toronto, Ontario M5S 3B2, Canada.

Figure 2 | Scanning electron micrographs of the head region of Hallucigenia
sparsa from the Burgess Shale. Anterior is to the left except for d-f, where
anterior is to the top. a-f, ROM 63146 (see Fig. 1d) with sketches of anterior
region (b) and circumoral elements (e); g-l, ROM 63142, part (g-h) and
counterpart (i-l), showing aciculae. See Fig. 1 for acronyms and symbols.
Detector mode: a, secondary electron; c-k, backscatter. Scale bars, 200 µm
(a-c, g, i, j), 50 µm (d-h), 20 µm (f, k-l).

Hallucigenia’s tubular body ranges from 10 mm to more than
50 mm in length (Extended Data Fig. 1a-c and Supplementary
Table 2). It bears ten elongate ventrolateral appendages (Fig. 1a-e);
the anterior eight are of uniform length, whereas the posterior two are
progressively shorter (Fig. 1d, e and Extended Data Fig. 2a-c). The final
pair of appendages is terminal, confirming the absence of a posterior
extension of the trunk’. The third to tenth appendage pairs are regularly
spaced; the first, second and third appendage pairs are twice as close
together (Fig. 1a, b, e and Extended Data Figs 1c, 3a, b, 4e and 5a). The
anterior three pairs of appendages are 1.5-2.0 times narrower than
the posterior seven, and lacked claws. These narrow appendages were
flexible and long enough to reach the mouth (Fig. 1a, e-g and Extended
Data Figs 1c, d, 2d, 3a, 4a and 6e, f). The posterior seven appendage
pairs are legs with terminal claws: two claws are present on appendages
four to eight, forming an acute angle (Fig. 1a-d and Extended Data
Fig. 3c, d, g), whereas a single claw adorns appendages nine and ten.

Seven pairs of equally spaced elongate spines occupy the dorsolateral
pinnacles of the trunk, situated above the third to ninth appendage
pairs (Fig. 1a-e). The spines in each pair are separated by 60-90°
(Extended Data Figs 1, 4 and 7). Each spine is supported by a buttress 
of soft tissue which forms a hump-like swelling of the body wall 
and is particularly prominent in larger individuals (Fig. 1d and 
Extended Data Figs 1a, c, e and 6). The spines are uniform in length,
width, spacing and shape: they are not quite straight but curve slightly 
(3.5° + 0.9°) towards the posterior. The spines are centrifugally 
arranged in lateral view: the more anterior spines tilt forwards; the 
rear spines tilt backwards. The construction of the spines and claws 
from stacks of nested elements has been reported elsewhere”®. 

The character of the trunk changes markedly at the position of the 
first pair of spines. Behind this point, the trunk exhibits a uniform 
girth. (A linear relationship between trunk girth and body length 
indicates isometric growth; see Supplementary Table 2.) In front of 
the first spine pair, the trunk is a third narrower than the posterior 
trunk, with a bulbous anterior expansion evident in smaller specimens 
(Fig. 1a-e and Extended Data Figs 1-8). The anterior trunk usually
bends at its midpoint, orienting the mouth opening ventrally. 

Approximately 500 μm from the anterior of the body and 100 μm
from the sagittal axis lies a dorsal pair of convex carbonaceous 
Figure 3 | Anatomical drawings of Hallucigenia sparsa. a, Lateral profile;
b, dorsal profile; c, frontal profile; d, e, head in dorsal (d) and lateral (e) views, 
corresponding to regions indicated in b and a, respectively; f, full reconstruc- 
tion. Drawings, reproduced with permission, by Danielle Dufault. See Fig. 1 
for acronyms. 


impressions, reaching 200 μm in diameter, which we interpret as eyes
(Fig. 2a-c, i, j and Extended Data Figs 3, 5, 7 and 8b-d, i-m). Their 
irregular surface (Fig. 2c and Extended Data Figs 3e, 5f and 8j, m) 
argues against the presence of ommatidia; the eyes were presumably 
simple rather than compound. This seems to be consistent with the 
Figure 4 | Ecdysozoan phylogeny, showing most parsimonious character
distribution of circumoral structures (dark blue) and pharyngeal teeth (light 
blue). Fitch parsimony indicates the presence of both these structures in the 
ancestral ecdysozoan; a scenario positing multiple independent innovations of 
this armature would be less parsimonious. The topology shown denotes the 
strict consensus of all most parsimonious trees recovered under implied 
weights with concavity constant (k) between 0.46 and 211, after the removal of 
Orstenotubulus. A newly retrieved ‘hallucishaniid’ clade—diagnosed by a 
swollen head, dorsal spines, and the differentiation of the anterior trunk and 
trunk appendages—includes luolishaniids, Orstenotubulus (not shown) and 
Carbotubulus within a paraphyletic ‘Hallucigenia’. Illustrated taxa are in bold
type; see discussion of transformation series 9 and 13 in Supplementary Note 1. 
For phylogenetic data and full results, see Supplementary Data. 


eyes of other lobopodians (Supplementary Note 1, transformation 
series 18). 

Reflective or darker regions occur along the axes of well-preserved 
appendages and appear, in the manner typical of lobopod limbs”, to 
represent extensions of the hydrostatic body cavity (Fig. 1e). A large
ampulla-shaped structure that opens anteriorly represents a buccal
chamber or ‘mouth’ (Fig. 1f, g and Extended Data Figs 1d, 2f, g, 4b, f
and 8f, g), and is followed by a foregut that consistently occupies the
central 50% of the anterior trunk (Fig. 1e and Extended Data Figs 1d,
2f, g, 4, 6, 7 and 8a, k). The foregut is darker than the surrounding
tissue, conceivably indicating the presence of a cuticular lining. At the
end of the head, the foregut widens into a broader, poorly preserved
midgut (Fig. 1e and Extended Data Figs 2b, 4 and 6); the gut ends in a
terminal anus (Extended Data Fig. 2b), through which decay fluids,
represented by a darkly stained region of variable extent (Fig. 1b, e and
Extended Data Figs 2a-c, 3a, b and 6a-d), were expelled. Preservation
of the hindgut is inadequate to determine whether it was differentiated 
from the midgut. 

From behind the buccal chamber to the first pair of appendages, the 
dorsal surface of the foregut lumen is lined with dozens of posterior- 
directed aciculae (Fig. 2g-l and Extended Data Fig. 4c, d). These
robustly carbonaceous structures are 10 μm long and gently curved;
their consistent size and orientation, uniform distribution, and absence
elsewhere in the gut exclude the possibility that they represent gut
contents; rather, they were biologically associated with the gut wall. 
At the back of the buccal chamber, around 200 μm from the anterior
termination of the trunk, lies a 250-μm-wide crescentic structure com-
posed of multiple identical lamellae, each around 10 μm across and
60 μm long. Lamellae are evident in every structure that is preserved,
and consistently display a radial arrangement (Fig. 2a-f, i, j and Extended 
Data Figs 5c, d and 8j-m). The structure is preserved laterally; it origin- 
ally constituted a ring of lamellae around the opening of the foregut. 

Like the claws and spines, the radial lamellae are preserved as 
discrete carbonaceous films—they were originally sclerotized, rather 
than representing soft tissue such as muscle, cuticular folds, or pig- 
mentation, and they do not represent a taphonomic artefact. The 
lamellae are fundamentally unlike the modified pair of claws that form 
the jaws of modern onychophorans. Insofar as they are numerous, 
elongate, and sclerotized, and are arranged radially around the anterior 
opening of the foregut, the lamellae convincingly resemble the circu- 
moral elements present in other ecdysozoans (see discussion in 
Supplementary Note 1, transformation series 9). To evaluate the evolu- 
tionary significance of this similarity we incorporated our observations 
(summarized in Fig. 3 and Supplementary Videos 1 and 2) into an 
updated phylogenetic matrix (Supplementary Data). 

The reconstruction of character states through Fitch parsimony 
indicates that sclerotized circumoral elements were present in 
the ancestral ecdysozoan (Fig. 4 and Supplementary Note 1, trans- 
formation series 9), supporting the homology between circumoral 
structures in Tardigrada®’* and stem-euarthropods’®''’**° and the 
circumoral (‘coronal’) spines of cycloneuralians'*”°”® (see discussion 
in Supplementary Notes 1 and 2, transformation series 9). Homology 
between the panarthropod pharynx and the cycloneuralian pharynx is 
corroborated by the presence of robust sclerotized teeth in the anterior 
pharynx (Fig. 4 and Supplementary Note 1, transformation series 13), 
previously reported in extant cycloneuralians, euarthropods and tar- 
digrades”’*’°”’ and now also evident in stem-group onychophorans. 
The simple construction of the modern onychophoran foregut there- 
fore reflects a secondary loss of cycloneuralian-like pharyngeal teeth 
and circumoral elements in the onychophoran stem lineage, and 
stands in marked contrast to the complex armoured foregut of the 
ancestral ecdysozoan. 
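
The Fitch parsimony step can be illustrated with a minimal sketch. The five-taxon tree, the taxon labels and the 0/1 codings below are hypothetical and chosen only to show the mechanics; they are not the matrix or the trees provided in the Supplementary Data.

```python
# Fitch downpass on a rooted binary tree written as nested tuples.
# Character: sclerotized circumoral elements (1 = present, 0 = absent).
# Topology and codings are illustrative assumptions, not the published data.

def fitch_downpass(node, tip_states):
    """Return (candidate state set, minimum number of changes) for a node."""
    if isinstance(node, str):                      # tip: its observed state
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_changes = fitch_downpass(left, tip_states)
    right_set, right_changes = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:                                     # agreement: no extra step
        return shared, left_changes + right_changes
    # disagreement: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1

toy_tree = (
    ("cycloneuralian_A", "cycloneuralian_B"),
    ("tardigrade", ("modern_onychophoran", "euarthropod")),
)
toy_states = {
    "cycloneuralian_A": 1,
    "cycloneuralian_B": 1,
    "tardigrade": 1,
    "modern_onychophoran": 0,
    "euarthropod": 1,
}

root_states, changes = fitch_downpass(toy_tree, toy_states)
print(root_states, changes)
```

On this toy tree a single loss accounts for the absence in the modern onychophoran tip, so presence at the root is the most parsimonious reconstruction, which mirrors the logic of the argument above.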


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Fossil materials. Materials are deposited at the Royal Ontario Museum, Toronto 
(ROM) and the Smithsonian Institution National Museum of Natural History, 
Washington DC (NMNH). Sediment covering parts of certain ROM specimens 
was mechanically removed using a tungsten-tipped micro-engraving tool. 
Specimens were photographed under various lighting conditions including dark- 
and bright-field illumination and polarized light, and imaged by backscatter and 
secondary electron microscopy under variable pressure. 

Taphonomic considerations. As with other Burgess Shale organisms,
Hallucigenia sparsa exhibits various degrees of pre- and post-burial decay, ranging 
from disarticulated specimens represented only by pairs of decay-resistant spines 
(Extended Data Fig. 9a), through partly disarticulated specimens retaining parts of 
the body (Extended Data Fig. 9b), to complete specimens whose curled appen- 
dages and trunks are consistent with post-mortem contraction following rapid 
burial of live organisms (Fig. 1a-e and Extended Data Figs 1-8). Consequently, the
widths of the trunk and appendages are subject to slight taphonomic variation 
within and between specimens (as in, for example, Fig. 1). The full length of the 
body and appendages, where preserved, is typically buried within the matrix and is 
difficult to prepare mechanically. 

Phylogenetic analysis. Phylogenetic analysis was conducted using the methods of 
Smith and Ortega-Hernandez*; in summary, parsimony analysis was performed in
TNT”? under a range of weighting parameters, with Goloboff’s concavity con-
stant”! ranging from k = 0.118 to 211, and under equal weights (k = ∞). Code is
available in the Supplementary Data. Orstenotubulus (80% tokens ‘ambiguous’ or 
‘inapplicable’) was identified as a wildcard taxon with an unconstrained position 
within the hallucishaniids; to improve resolution it is omitted from the strict 
consensus trees presented in the main manuscript. 
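
As a rough illustration of what the concavity constant does, the sketch below uses the standard implied-weights penalty for a character with h extra steps, h / (k + h). The function names and the intermediate value k = 3 are illustrative assumptions; this is not the TNT script from the Supplementary Data.

```python
# Implied-weights penalty for a character with `extra_steps` steps of homoplasy.
# Small k discounts additional homoplasy heavily; large k approaches equal weights.

def implied_weight_penalty(k, extra_steps):
    """Penalty h / (k + h) contributed by one character."""
    return extra_steps / (k + extra_steps)

def second_step_ratio(k):
    """Cost of a second extra step relative to the first (equals k / (k + 2))."""
    first = implied_weight_penalty(k, 1) - implied_weight_penalty(k, 0)
    second = implied_weight_penalty(k, 2) - implied_weight_penalty(k, 1)
    return second / first

for k in (0.118, 3.0, 211.0):
    print(k, round(second_step_ratio(k), 4))
```

The ratio tends to 1 as k grows, so every extra step costs the same in the limit, which is why equal weighting corresponds to an infinite concavity constant.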
Extended Data Figure 1 | Hallucigenia sparsa from the Burgess Shale. a, b, largest (a, ROM 57169) and smallest (b, ROM 62093) specimens, to the same scale;
c, ROM 57168, with enlargements of the anterior (d) and mid-trunk (e). Acronyms as in Fig. 1. Scale bars, 5 mm.
Extended Data Figure 2 | Hallucigenia sparsa from the Burgess Shale. a, ROM 63139, showing posterior body termination; b, c, NMNH 198658,
showing posterior termination (see also Fig. 1e); d-g, ROM 63143: e, enlargement of region marked in d; f, g, backscatter SEMs of regions marked
in e. Acronyms as in Fig. 1. Scale bars, 5 mm (a-d), 1 mm (e), 0.5 mm (f), 0.1 mm (g).
Extended Data Figure 3 | Hallucigenia sparsa from the Burgess Shale. a, c, ROM 63142: a, composite image incorporating part and counterpart of
the entire specimen; c, claw pair. b, d-g, ROM 63051: b, composite image incorporating part and counterpart of the entire specimen; d, anterior section;
e, f, eyes; g, claw pair. c-e are backscatter electron micrographs. Acronyms as in Fig. 1. Scale bars, 5 mm (a, b), 500 μm (d), 50 μm (c, f, g), 20 μm (e).
Extended Data Figure 4 | Hallucigenia sparsa from the Burgess Shale. a-d, ROM 61513: a, entire specimen; b-d, enlargements of anterior region,
showing mouth opening, aciculae and eyes; mouth opening to right in b, to left in c, d. e, f, ROM 61143: anterior region marked in e is enlarged in f.
Acronyms as in Fig. 1. Scale bars, 5 mm (a, e), 1 mm (b, f), 200 μm (c), 20 μm (d).
Extended Data Figure 5 | Hallucigenia sparsa (ROM 62269) from the Burgess Shale. a, part; b, counterpart, anterior section, showing eyes; c, d, eyes and
mouthparts (backscatter SEM); e, f, detail of eyes (counterpart). Acronyms as in Fig. 1. Scale bars, 1 mm (a), 400 μm (b), 200 μm (e), 100 μm (c, d), 20 μm (f).
Extended Data Figure 6 | Hallucigenia sparsa from the Burgess Shale. a-d, NMNH 83935 (holotype): in contrast to body tissue, decay fluids lack a sharp
margin and are non-reflective; e, f, ROM 57776, showing full length of appendage one. Acronyms as in Fig. 1. Scale bars, 5 mm.
Extended Data Figure 7 | Hallucigenia sparsa (ROM 63146), composite image of part and counterpart. Acronyms as in Fig. 1. Scale bar, 5 mm.
Extended Data Figure 8 | Hallucigenia sparsa from the Burgess Shale. a-d, NMNH 193996: b, c, enlargements of area boxed in a; c, secondary
electron micrograph; d, backscatter electron micrograph of region marked in c. e-g, ROM 63141, showing position of mouth. h-j, ROM 63144: i, secondary
electron image of region marked in h; j, backscatter electron image of region marked in i, showing eyes and mouthparts, with interpretative diagram.
k-m, ROM 63140: l, backscatter SEM of head, showing right eye and mouthparts (enlarged in m, with interpretative diagram). Acronyms as in
Fig. 1. Scale bars, 10 mm (k), 5 mm (a, e, h), 1 mm (b, c, l), 0.5 mm (i), 0.1 mm (d, j, m).
Extended Data Figure 9 | Hallucigenia sparsa from the Burgess Shale. a, ROM 43045, cluster of dissociated specimens; b, ROM 63145, dissociated specimen
showing spines in close anatomical position. Acronyms as in Fig. 1. Scale bars, 10 mm.
Extended Data Table 1 | Interpretations of Hallucigenia through time

Authority | Walcott 1911 (ref. 32) | Conway Morris 1977 | Ramsköld & [illegible] | Ramsköld [illegible] | Hou & Bergström 1995 | Steiner et al. [illegible] | This study
Spines | Lateral (parapodia) | Ventral | Dorsal | Dorsal | Dorsal | Dorsal | Dorsal
Trunk appendages | Not interpreted | Single | Paired? | Paired | Paired | Paired | Paired
Slender legs at | Not observed | Posterior | Posterior | Anterior | Posterior | Anterior | Anterior
Pairs of slender legs | Not observed | 3 | 3 | 3 | [illegible] | [illegible: '... fortis hongmeia)'] | 3
Width of trunk beyond slender legs | Not observed | Constant | Narrowing slightly | Narrowing slightly | Bulge | Bulge | Narrow ‘neck’ with terminal bulge
Vertical trunk [illegible] beyond slender legs | Not observed | Curved | Straight | Straight | Straight | Straight | Curved ventrally
Position of first slender leg | Not observed | Beyond spine | Beyond spine | Below end spine | Beyond spine | Beyond spine | Below first spine
(Pairs of) legs beyond spines at opposite end | Not observed | One [pair] | One pair | One pair | One pair | One pair | One pair
Body continues beyond legs at end opposite slender legs | Not observed | Yes | Yes | No | Yes | Yes | No

References 32-35 are listed in the Methods section.
Sex reversal triggers the rapid transition from genetic
to temperature-dependent sex 
Sex determination in animals is amazingly plastic. Vertebrates
display contrasting strategies ranging from complete genetic 
control of sex (genotypic sex determination) to environmentally 
determined sex (for example, temperature-dependent sex deter- 
mination)’. Phylogenetic analyses suggest frequent evolutionary 
transitions between genotypic and temperature-dependent sex 
determination in environmentally sensitive lineages, including 
reptiles’. These transitions are thought to involve a genotypic sys- 
tem becoming sensitive to temperature, with sex determined by 
gene-environment interactions’. Most mechanistic models of 
transitions invoke a role for sex reversal*°. Sex reversal has not 
yet been demonstrated in nature for any amniote, although it 
occurs in fish and rarely in amphibians”*. Here we make the first 
report of reptile sex reversal in the wild, in the Australian bearded 
dragon (Pogona vitticeps), and use sex-reversed animals to experi- 
mentally induce a rapid transition from genotypic to temperature- 
dependent sex determination. Controlled mating of normal males 
to sex-reversed females produces viable and fertile offspring whose 
phenotypic sex is determined solely by temperature (temperature- 
dependent sex determination). The W sex chromosome is elimi- 
nated from this lineage in the first generation. The instantaneous 
creation of a lineage of ZZ temperature-sensitive animals reveals a 
novel, climate-induced pathway for the rapid transition between 
genetic and temperature-dependent sex determination, and adds 
to concern about adaptation to rapid global climate change. 

Sex determination is the regulatory process that initiates differenti- 
ation of the gonads in the early embryo to form either testes or ovaries. 
Very many reptiles have temperature-dependent sex determination 
(TSD), whereby the temperature that eggs experience in the nest deter- 
mines the sex of offspring. Others have male or female heterogamety, 
either XX/XY systems as in mammals or ZZ/ZW systems as in birds, 
with or without strongly heteromorphic sex chromosomes’. Mounting 
evidence suggests that genotypic and environmental modes of sex 
determination are not mutually exclusive dichotomous strategies’. 
Many species have differentiated sex chromosomes, but also show a 
temperature override, where genes and environment interact to deter- 
mine sex’®’*. Furthermore, the great diversity of sex-determining
mechanisms seen in reptiles, matched in amphibians and fishes, but 
not mammals and birds, shows poor respect for phylogeny, implying a 
complex evolutionary history of multiple transitions among sex deter- 
mination modes””. 

The widely distributed Australian central bearded dragon (P. vitticeps) 
was one of the first reptile species in which a temperature override 
was observed!*!*, and the first in which it was demonstrated 
genetically’. P. vitticeps has a female heterogametic system of sex 
determination with ZZ males (ZZm) and ZW females (ZWf)’*, but 
high incubation temperatures experimentally feminize chromoso- 
mally male animals and produce sex-reversed females (ZZf)’”. To 
identify sex-reversed females, we developed a new robust sex-specific 


molecular marker from previously characterized P. vitticeps sex 
chromosome sequences'*”” and validated the test against a panel of 
unrelated individuals incubated at temperatures where phenotypic 
sex and genotypic sex are concordant (20 ZZm and 20 ZWf). 
Absence of a W chromosome in putatively sex-reversed ZZ females 
was confirmed cytogenetically to eliminate the possibility of low- 
frequency recombination being mistakenly interpreted as sex reversal 
(Extended Data Figs 1-3). 

Animals were sampled from several widely distributed populations. 
Application of the PCR sex marker to 131 wild-caught individuals 
identified 11 sex-reversed ZZ females occurring towards the northern 
end of the P. vitticeps range, near the border of Queensland and New 
South Wales (Fig. 1). Although sex reversal in reptiles has been 
demonstrated under laboratory conditions’®"”’, this is the first time 
that sex reversal has been shown to occur naturally in a wild popu- 
lation of reptiles, or indeed any amniote. Sex reversal was widespread 
in this population, with instances distributed over a total area of 
23,650 km² in remote semi-arid Australia. Among wild phenotypic
females, the proportion of ZZ sex-reversed females increased each
year over the study, from 6.7% in 2003, to 13.6% in 2004, to 22.2%
in 2011 (Fig. 1), suggestive of a trend but not significant (χ² = 1.65,
d.f. = 2, P = 0.44).
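
The reported P value can be checked directly from the test statistic. For 2 degrees of freedom the upper tail of the chi-square distribution has the closed form exp(-x/2), so no statistics library is needed. This sketch only converts the published statistic; the per-year counts behind it are not reproduced here.

```python
import math

def chi2_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)

p_trend = chi2_upper_tail_df2(1.65)
print(round(p_trend, 2))  # → 0.44
```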

The sex-reversed females were viable and fertile. In fact, our 
sex-reversed females laid significantly more eggs per year (mean 


Figure 1 | Geographical distribution of sex reversal in wild populations of
P. vitticeps. Location of sex-reversed ZZ females (ZZf) across years is indicated
by red circles (N = 11), normal ZW females (ZWf) by black circles (N = 72)
and normal ZZ males (ZZm) by grey circles (N = 48). Pie charts indicate
the relative proportions of ZZf and ZWf in years where sample size exceeded 15
phenotypically female individuals. The temporal trend is suggestive, but not
significant (χ² = 1.65, d.f. = 2, P = 0.44).
Figure 2 | Offspring sex ratio as a function of egg incubation temperature in
P. vitticeps. a, GSD system of sex determination with a high-temperature
override. Data are from ref. 12 (open circles) and this study (filled circles).
Proportion of phenotypically female offspring from control ZZm × ZWf
crosses is a function of constant incubation temperature T, given by
$$\Pr\{f\} = 0.5033 + \frac{(1 - 0.5033)\,e^{bT - 63.5904}}{1 + e^{bT - 63.5904}},$$
where the slope b is approximately 1.87 (its trailing digits are not legible in this copy).
b, Functional TSD by sex reversal. Proportion of phenotypically female offspring
from ZZm × ZZf crosses is given by
$$\Pr\{f\} = \frac{e^{-52.9999 + 1.5866T}}{1 + e^{-52.9999 + 1.5866T}}.$$
Dashed lines, extrapolation of the fitted curve beyond the data. Shaded regions,
95% confidence limits. The number of individuals in each treatment is shown.


ZZf = 47.3 eggs, N = 6 females) than ZW females of an equivalent age
(mean ZWf = 24.5 eggs, N = 11 females) (P < 0.05).

Four wild-caught and three captive-bred’? sex-reversed females 
(ZZf) were mated with normal males (ZZm) under controlled labor- 
atory conditions, to yield 389 eggs from 21 clutches. As expected from 
ZZm × ZZf matings, eggs incubated at the low temperature of 28 °C
produced ZZ hatchlings, which were all male (N = 35 hatchlings, two 
complete clutches, 87.5% hatching success). Parentage analysis using 
2,229 single nucleotide polymorphism (SNP) markers in a subset of 
individuals confirmed that eggs laid by sex-reversed females were the 
product of sexual reproduction and not facultative parthenogenesis 
(Supplementary Table 1), a phenomenon that is uncommon, but 
known in some squamates'*”°. 

Following our experiments at the low incubation temperature of 
28°C, we then compared the offspring sex ratio of sex-reversed 
females (74 ZZf eggs) and control females (130 ZWf eggs) across a 
range of temperatures. Hatching success was high both for sex- 
reversed and for control matings (95.9% and 85.4%, respectively) 
(Supplementary Table 2). Offspring sex ratios from the control 
matings confirmed the existence of a ZW chromosomal temperature 
override as previously described’ (Fig. 2a) (probit regression, Wald
χ² = 14.13, d.f. = 1, P < 0.0002). Specifically, chromosomal influence
over sex determination was dominant from 22 °C to 32 °C, where the 
proportion of males and females was equal (estimate of unconstrained 
Figure 3 | Rate of sex reversal as a function of egg incubation temperature
in P. vitticeps. Offspring of sex-reversed mothers (ZZf shown in red) are 
reversed more frequently and at a lower temperature than the offspring of 
control mothers (ZWf shown in black), implying that temperature sensitivity is 
variable in the population and heritable. Vertical bars, standard error of the 
observed proportion. Dashed lines, extrapolation of the fitted curve beyond the 
data (see Methods for equations). Dotted lines, the pivotal temperature at 
which half of ZZ offspring are reversed. Sample size (numbers) is the total 
number of ZZ individuals. 


lower asymptote = 0.503; Fig. 2a). Temperature began to interact with 
and override chromosomal sex determination (causing sex reversal) 
above 32 °C, reaching almost complete female bias at high tempera- 
tures (proportion female at 36°C = 0.96) (Fig. 2a; Supplementary 
Table 2). In contrast to the control matings, the offspring sex ratios 
produced by sex-reversed females followed the pattern of a TSD spe- 
cies with no chromosomal influence over sex (logistic regression, Wald 
χ² = 8.09, d.f. = 1, P < 0.005) (Fig. 2b and Supplementary Table 2). In
the absence of a W chromosome, the female phenotype is possible only 
via sex reversal’? (Fig. 3 and Supplementary Table 3). Thus, sex- 
reversed mothers produced only male hatchlings at low and intermedi- 
ate incubation temperatures (26 °C and 28 °C), but at 33 °C produced 
female as well as male offspring (proportion female = 0.28) and at 
34 °C produced offspring that were predominantly female (proportion 
female = 0.75) (Fig. 2b). 
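
The fitted TSD curve for the ZZm × ZZf crosses can be evaluated numerically. The sketch below uses the two coefficients as read from the Fig. 2b legend; the negative sign on the intercept is an assumption (the legend is poorly legible in this copy), made so that the proportion of females rises with temperature as the text describes. Function and variable names are illustrative.

```python
import math

# Coefficients as read from the Fig. 2b legend; the negative intercept is an
# assumption chosen so that the curve increases with temperature.
INTERCEPT = -52.9999
SLOPE = 1.5866  # per degree Celsius

def pr_female_zzf_cross(temp_c):
    """Fitted proportion of phenotypic females at a constant incubation temperature."""
    z = INTERCEPT + SLOPE * temp_c
    return 1.0 / (1.0 + math.exp(-z))

# Temperature at which the fitted proportion of females is exactly 0.5.
pivotal_temp = -INTERCEPT / SLOPE

for temp_c in (26, 28, 33, 34):
    print(temp_c, round(pr_female_zzf_cross(temp_c), 3))
print("pivotal temperature:", round(pivotal_temp, 1))
```

Under these assumptions the curve gives essentially no females at 26 °C and 28 °C and a steep rise between 33 °C and 34 °C, in line with the observed proportions of 0.28 and 0.75.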

Logistic regressions of rate of sex reversal against temperature were 
significant for the offspring of both normal ZZm × ZWf crosses
(Wald χ² = 8.14, d.f. = 1, P < 0.005) and sex-reversed ZZm × ZZf
crosses (Wald χ² = 8.09, d.f. = 1, P < 0.005) (Fig. 3 and Supple-
mentary Table 3). However, offspring of sex-reversed mothers are
themselves more sensitive to temperature than offspring
of normal ZW mothers (χ² = 55.39, P < 0.0001) (Fig. 3). The pivotal
temperature for offspring of ZW mothers was 34.7 °C whereas for the 
offspring with sex-reversed ZZf mothers it was lower, at 33.5°C 
(Fig. 3). 
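
The significance thresholds quoted for these tests can be checked from the statistics alone. For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)). The sketch assumes 1 d.f. for the comparison between maternal types, which the text does not state.

```python
import math

def chi2_upper_tail_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

p_control = chi2_upper_tail_df1(8.14)      # ZZm x ZWf crosses
p_reversed = chi2_upper_tail_df1(8.09)     # ZZm x ZZf crosses
p_comparison = chi2_upper_tail_df1(55.39)  # 1 d.f. is an assumption here

print(p_control < 0.005, p_reversed < 0.005, p_comparison < 0.0001)
```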

Our discovery of naturally occurring sex reversal has afforded us the 
unique opportunity to conduct ZZm × ZZf matings that can yield only ZZ
ZZ offspring, whose sex is determined entirely by incubation temper- 
ature. The W chromosome was thus eliminated from our sex-reversed 
lineage in one generation, and sex of the offspring was determined by a 
mode indistinguishable from TSD. Thus, we have used high-temper- 
ature incubation to experimentally induce, within a lineage, a trans- 
ition from a predominantly genotypic system with heteromorphic sex 
chromosomes’” to a temperature-dependent sex determination sys- 
tem. The homomorphic sex chromosomes in ZZ male and female 
offspring have effectively become autosomal. Although loss of the W 
chromosome can be induced in a single generation under laboratory 
conditions, in wild populations the W chromosome is expected to 
persist for multiple generations and the transition in sex determination 
modes to be more gradual.
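
The single-generation loss of the W chromosome follows from simple segregation, as the sketch below shows by enumerating gamete combinations. It assumes each parent contributes one sex chromosome with equal probability; the function name is illustrative.

```python
from collections import Counter
from itertools import product

def offspring_genotypes(father, mother):
    """Expected genotype frequencies when each parent passes on one sex chromosome."""
    # Sort in reverse so that a Z/W pair is always written "ZW".
    counts = Counter(
        "".join(sorted(pair, reverse=True)) for pair in product(father, mother)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

control_cross = offspring_genotypes("ZZ", "ZW")   # normal male x normal female
reversed_cross = offspring_genotypes("ZZ", "ZZ")  # normal male x sex-reversed female

print(control_cross)
print(reversed_cross)
```

Half of the offspring of the control cross carry a W, whereas none of the offspring of the sex-reversed cross do, so phenotypic sex in that lineage can depend only on incubation temperature.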
This experimental transition from genotypic sex determination
(GSD) to TSD demonstrates a novel transitional pathway, in which 
TSD can evolve rapidly in response to extreme environmental condi- 
tions (high temperatures), without requiring that there be sex-specific 
selective advantages. This observation challenges conventional theory 
on the drivers of transitions from GSD to TSD. Our stochastic model 
of the transition from GSD to TSD provides an alternative mechanism 
to the prevailing view that TSD evolves in response to a fitness advant- 
age to the offspring to be female at some temperatures and male at 
others (invoking the Charnov-Bull model)***. Until now it has not 
been possible to disambiguate sex-specific fitness differences that 
could cause a GSD to TSD transition from fitness differences that arose 
after the transition. The Charnov-Bull model still is key in explaining 
the maintenance of TSD in populations; however, we show here that 
there are multiple pathways for populations to achieve this transition 
(stochastic and evolutionary). In our case, optimization of male and 
female fitness at the different temperatures they experience probably 
follows and reinforces the GSD to TSD transition, rather than being 
the proximal cause of the evolutionary transition. 

Our demonstration that TSD can evolve rapidly in response to 
extreme environmental conditions (high temperatures) predicts that 
in the wild TSD may become fixed in small demes in which sex- 
reversed ZZf females mate, and the W is lost stochastically. Drift effects 
may be especially important if the sensitivity of ZZ males to sex 
reversal (the pivotal temperature for sex determination) has limited 
opportunity to evolve, owing to a small number of generations or low 
effective heritability’. Alternatively, the transition to TSD may be 
driven by positive selection’'. Our extraordinary finding that sex-
reversed females laid nearly twice as many eggs per year as normal
ZW females of an equivalent age suggests an immediate fitness advant- 
age to sex reversal”* which could drive transitions. Indeed, the greater 
fecundity of sex-reversed females may well combine with their her- 
itable increased propensity to reverse (Fig. 3), exacerbating the over- 
production of females and accelerating the loss of the W chromosome. 

In the absence of drift and positive selection, the W chromosome 
may still be eliminated from populations by Fisher’s frequency- 
dependent selection*”®. An increase in the frequency of ZZ females 
by high-temperature reversal will be favoured by selection for mothers 
who produce only sons (the rarer sex at higher temperatures). 
Modelling this response (see Methods) shows a precipitous decline 
in the frequency of the ZW genotype with increasing incubation tem- 
perature. The W chromosome frequency reduces at temperatures 
above 32.0°C with complete loss, enforced by frequency-dependent 
selection alone, at temperatures above 33.4 °C (Extended Data Fig. 4). 
Our wild population resides on the precipice between GSD with sex 
reversal and TSD arising from the loss of the W chromosome. Thus, 
under climatic conditions where extreme high temperatures are 
experienced at an increasing rate, the relatively rapid loss of the W 
chromosome and the adoption of TSD become increasingly likely via 
the combined effects of drift, positive selection and/or Fisherian sex- 
ratio selection. Consistent with these predictions we observed 
increased rates of sex reversal among wild phenotypic females over 
the course of this study (Fig. 1). 

Our observation that offspring from sex-reversed mothers are more 
frequently reversed and at a lower temperature than the offspring of 
normal ZW mothers (Fig. 3) implies that the eggs of ZZ sex-reversed 
females are more sensitive to temperature. This strongly suggests that 
heritable variation (genetic or epigenetic) exists at the locus that con- 
trols sex determination in P. vitticeps, providing a mechanism for sex 
determination thresholds to evolve. Additional evidence for heritable 
selectable variation in thermosensitivity was observed in a single sex- 
reversed female that produced 100% male offspring at all temperatures 
(28-36 °C; N = 32 eggs; Supplementary Tables 2 and 3; data excluded 
from analyses). 

One of the most important questions for the near future is whether 
organisms will be sufficiently resilient to withstand a rapidly changing 
climate, or so vulnerable that they will succumb to extinction’””*. We
provide here an example of how climatic extremes can rapidly and 
fundamentally alter the biology (switches in sex determination mode) 
and the genome (loss of the W chromosome) of climate-sensitive 
reptiles. Thus adverse evolutionary responses, specifically a switch to 
TSD if global temperatures rise, can occur through a combination of 
temperature sensitivity and stochastic processes. This is important 
because temperature-induced extreme sex-ratio bias is thought to be 
an extinction driver exclusively in species with TSD”. However, we 
show here that these risks may extend more broadly. Exposure to high 
temperatures can perturb apparently stable GSD systems, induce a 
rapid transition to TSD and then proceed inexorably towards a highly 
feminized population and thus a greater risk of extinction. 

The key to determining how important transitions in sex-determin- 
ing mode are to species’ survival will be to discover how thermal 
sensitivities can adapt after the transition to TSD. If adaptation is 
rapid, TSD could produce stable unbiased sex ratios, and may even 
be favoured by climate change. For example, a temperature-dependent 
strategy might afford greater control of sex ratio manipulation in an 
unpredictable climate. In this way, reptiles may have greater capacity 
to cope and compensate for climate change than previously appre- 
ciated’’. A high degree of flexibility in sex-determination mode could 
be a powerful and, until now, unappreciated weapon in the arsenal of 
evolutionary responses to an unpredictable climate. 


METHODS

Field collection. We collected samples (tail snips or blood) from 131 wild adult 
P. vitticeps individuals, during sampling trips conducted in October 2003, 2004, 
2008, September 2009 and March 2010, 2011. Phenotypic sex was determined by 
the presence or absence of hemipenes visible as two latero-ventral lumps on the tail 
immediately behind the vent and subsequent eversion of hemipenes if present, and 
presence or absence of eggs by palpation, with confirmatory but not definitive 
presence or absence of enlarged femoral pores and larger head size characteristic 
of males. 

Breeding experiments. Captive females were allowed to lay eggs naturally in a 
sand substrate. Eggs were recovered from cages within 14 h and transferred to 
plastic boxes filled with vermiculite with a water potential of −200 kPa (120.0%
water to vermiculite, by mass). Allocation of clutches to experiments was not
randomized and the investigators were not blinded to allocation during experi- 
ments and outcome assessment. In the first experiment, two whole clutches (21 
and 19 eggs) from a sex-reversed female (ZZf) were selected as they became 
available and incubated at a constant 28 °C (28.1 °C, s.d. ±0.6) to evaluate
reproductive viability. In the second experiment, clutches laid both by sex-reversed
(ZZf) and by control (ZWf) females were chosen randomly then systematically 
allocated across four constant temperature incubation treatments: 26 °C, 28 °C, 
33 °C, 34°C (Supplementary Table 2). An additional 36 °C treatment was con- 
ducted only for the eggs of control females because of lower numbers of available 
eggs from ZZf. Sample sizes were determined by availability of clutches from the 
captive breeding programme. No statistical methods were used to predetermine 
sample size. We have not interpreted non-significant results, where considerations 
of power are more crucial. Eggs were removed from the experiment if they were 
physically damaged (crushed or pierced) or inviable (no white patch or vascular- 
ization) before entering the thermosensitive period when sex is determined” 
(Supplementary Table 4). Whole clutches with egg mortality rates of at least 
40% were excluded from analysis. The phenotypic sex for all captive-bred animals 
was established by hemipene eversion upon hatching”, hemipenal transillumina- 
tion** at 1-4 months and by gross external morphology at 4-9 months of age. 
Female reproductive fitness was estimated as the total number of eggs produced 
per season, a measure that summarizes the effects of clutching frequency and 
clutch size. We compared the reproductive fitness of 6 sex-reversed (ZZf) and 
12 control (ZWf) females using unpaired t-tests. 

Molecular detection of sex reversal. All individuals collected from the wild 
and bred in captivity were genotypically sexed using two PCR primers: H2, 
GCCCATATCTCACTAGTTCCCCTCC; F, CAGTTCCTTCTACCTGGGAGTGC,
which flank two W-chromosome-specific deletions (150 base pairs and 14
base pairs) that we identified in published P. vitticeps anonymous sex chromosome 
sequence (GenBank accession numbers EU938138.1 and KM508988). PCR conditions
for the novel test were 1× MyTaq HS Red mix (Bioline), 4 µM each primer
and 50 ng of genomic DNA. Cycling conditions were 95 °C for 5 min; (95 °C for
20 s, 70–65 °C for 20 s, 72 °C for 1 min) × 10 cycles with annealing temperature
decreased by 0.5 °C per cycle; (95 °C for 20 s, 65 °C for 20 s, 72 °C for 1 min) × 30
cycles; 72 °C for 10 min. PCR products were visualized on a 1.5% agarose gel using
SYBR Safe (Life Technologies). Two bands amplified in ZW individuals, whereas a 
single control band amplified in ZZ individuals. Individuals showing genotype- 
phenotype discordance were classed as sex-reversed. The rate of sex reversal was 
calculated as the proportion of ZZ individuals with a female phenotype. All 
molecular sex tests were conducted with the investigator blinded to the identity 
and phenotypic sex of the samples. 
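
The touchdown cycling programme and the genotype-phenotype classification described above can be sketched in Python. This is an illustrative sketch only, not the authors' code: the function names, the band-count encoding (2 bands for ZW, 1 band for ZZ), the 'F'/'M' sex labels and the assumption that the first touchdown cycle anneals at 70 °C are all introduced here for illustration.

```python
def touchdown_annealing(start_c=70.0, step_c=0.5, cycles=10):
    """Annealing temperature (°C) for each touchdown cycle.

    Assumes the first cycle anneals at start_c and each later cycle
    is step_c cooler, so ten cycles run from 70.0 down to 65.5.
    """
    return [start_c - step_c * i for i in range(cycles)]


def programmed_hold_seconds(touchdown_cycles=10, standard_cycles=30):
    """Sum of programmed hold times; ramping between steps is ignored."""
    initial_denaturation = 5 * 60
    per_cycle = 20 + 20 + 60  # denature, anneal, extend (seconds)
    final_extension = 10 * 60
    total_cycles = touchdown_cycles + standard_cycles
    return initial_denaturation + total_cycles * per_cycle + final_extension


def genotype_from_bands(n_bands):
    """Two bands indicate ZW; a single control band indicates ZZ."""
    if n_bands == 2:
        return "ZW"
    if n_bands == 1:
        return "ZZ"
    raise ValueError("expected 1 or 2 bands, got %r" % (n_bands,))


def sex_reversal_rate(records):
    """Proportion of ZZ individuals with a female phenotype.

    records: iterable of (n_bands, phenotypic_sex), sex given as 'F' or 'M'.
    """
    zz_sexes = [sex for n_bands, sex in records
                if genotype_from_bands(n_bands) == "ZZ"]
    if not zz_sexes:
        raise ValueError("no ZZ individuals in records")
    return zz_sexes.count("F") / len(zz_sexes)
```

For a hypothetical set of records such as `[(2, "F"), (1, "M"), (1, "F"), (1, "M"), (1, "M")]`, one of the four ZZ individuals is phenotypically female, so the function returns 0.25.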

Cytogenetics. The accuracy of the PCR sex test was validated using C-banding 
(Extended Data Fig. 1), comparative genomic hybridization (Extended Data Fig. 2) 
and by physically mapping a P. vitticeps W-chromosome-linked microsatellite 
motif (Extended Data Fig. 3). Two wild-type ZW females (001003386049, 
001003342236), two wild-type ZZ males (001003338787, 001003387339) and a 
putative sex reversal female (001003344224) of P. vitticeps were used for cytoge- 
netic analyses. Metaphase chromosome spreads were prepared from fibroblast 
cultures of tail tissue following ref. 33. Metaphases for all individuals were stained 
with 4’,6-diamidino-2-phenylindole (DAPI), the chromosome number identified 
and compared with the normal P. vitticeps karyotype”, to eliminate the possibility 
of chromosome abnormality. C-banded chromosomes were obtained by the 
CBG method (C-bands by barium hydroxide using Giemsa)'***. Comparative 
genomic hybridization was conducted as previously described’*”* using fluores- 
cently labelled male and female genomic DNA. Physical mapping of the 
W-chromosome-linked microsatellite (AAGG)8 was conducted using fluorescence
in situ hybridization, following ref. 37. For all cytogenetic analyses, the
presence or absence of W-chromosome-specific signal was scored in eight to ten 
metaphases per individual. 

Statistical analysis. Curves of best-fit offspring responses to incubation temper- 
ature were estimated by applying logistic regressions to the raw data (male = 0; 
female = 1) using PROC LOGISTIC in SAS Software version 9.1 or, in the case of
the proportion of females varying from 0.5 to 1, by fitting a logistic regression with 
unconstrained lower asymptote using PROC PROBIT. In both cases, the SAS ODS 
output was used to generate 95% confidence limits around the estimated regres- 
sion lines. Logistic regressions were compared using PROC LOGISTIC with the 
addition of a CLASS variable for maternal genotype (ZZ × ZZ versus ZZ × ZW).
These analyses have an appropriate error structure for data in the form of counts of 
males and females, and counts of sex-reversed versus concordant individuals.

Parentage analysis. Parentage analysis was conducted to confirm that the eggs
produced by sex-reversed females (ZZf) were the product of sexual reproduction 
and not asexual reproduction by parthenogenesis. We genotyped 18 individuals at 
2,229 SNPs using a proprietary reduced-representation sequencing approach 
called DArTseq (Diversity Arrays Technology), alternatively referred to as
double-digest restriction-site-associated DNA markers (RAD-seq). Four methods
of complexity reduction were tested in P. vitticeps (data not presented) and the 
PstI-SphI method was selected. Approximately 2,000,000 sequences per barcode 
per sample were identified and used in marker calling. We subsequently applied 
further stringent quality control measures, requiring 100% reproducibility over 
two independent runs, a minimum of 5× read depth (mean read depth = 32)
and complete data for all individuals. Specifically, we sequenced three parents (two 
sex-reversed females mated to the same male), nine offspring from the first
pairing, two offspring from the second pairing, three unrelated ZW individuals and one
unrelated ZZ individual. We used these data to estimate mother-offspring 
sequence identity (clonal parthenogenesis hypothesis), the percentage of hetero- 
zygous loci in offspring (non-clonal parthenogenesis hypothesis) and the percent- 
age of parent-offspring allelic mismatches (sexual reproduction hypothesis). For 
comparison, these statistics were also calculated comparing four unrelated indi- 
viduals to the parental genotypes. Parentage hypotheses were tested using t-tests 
(Supplementary Table 1). 
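
The three summary statistics named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: SNP genotypes are assumed to be coded as the count of the alternative allele (0, 1 or 2) at biallelic loci, and a mismatch is assumed to mean an offspring genotype that cannot be formed from one maternal and one paternal allele. All function names are hypothetical.

```python
# Alleles carried by each genotype code (count of the alternative allele).
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def _check_same_length(*profiles):
    if len({len(p) for p in profiles}) != 1 or len(profiles[0]) == 0:
        raise ValueError("profiles must be non-empty and of equal length")


def sequence_identity(mother, offspring):
    """Fraction of loci at which two individuals have identical genotypes."""
    _check_same_length(mother, offspring)
    return sum(m == o for m, o in zip(mother, offspring)) / len(mother)


def heterozygosity(genotypes):
    """Fraction of loci that are heterozygous (genotype code 1)."""
    _check_same_length(genotypes)
    return sum(g == 1 for g in genotypes) / len(genotypes)


def mismatch_rate(mother, father, offspring):
    """Fraction of loci where the offspring genotype is impossible for the pair."""
    _check_same_length(mother, father, offspring)
    mismatches = 0
    for m, f, o in zip(mother, father, offspring):
        # Every genotype obtainable from one maternal plus one paternal allele.
        possible = {am + af for am in ALLELES[m] for af in ALLELES[f]}
        if o not in possible:
            mismatches += 1
    return mismatches / len(offspring)
```

Under these definitions, clonal parthenogenesis would predict mother-offspring identity close to 1, whereas sexual reproduction would predict a mismatch rate against the candidate parents close to 0 and well below that obtained against unrelated individuals.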

Modelling the decline of the ZW genotype. We modelled the decline of the ZW 
genotype resulting from frequency-dependent selection because of overproduc- 
tion of females through sex reversal with increasing temperature’. Fitness within a 
sex is the same for all genotypes. Let the starting frequency of ZW among zygotes 
be $y$, the starting frequency of ZZ be $z = 1 - y$, and let a fraction $P_1[T]$ of ZZ
become reversed to a female phenotype if they have a ZW mother, $P_2[T]$ if they
have a ZZ mother. The equations for $P_1[T]$ and $P_2[T]$ are the functions of
temperature $T$ given in Fig. 3. In any given generation $n$, we have the proportion of
female phenotypes ($f_n$) equal to the sum of the numbers of normal ZW females
($y_n$), of sex-reversed ZZ females with ZW mothers ($r_n$) and of sex-reversed ZZ
females with ZZ mothers ($r'_n$),

$$ f_n = y_n + r_n + r'_n $$

The frequency of ZW zygotes and the frequency of ZZ zygotes with ZW mothers
are equal and both given by

$$ y_{n+1} = z_{n+1} = \frac{1}{2}\,\frac{y_n}{f_n} $$

The frequency of sex-reversed ZZ zygotes with ZW mothers is thus given by

$$ r_{n+1} = P_1[T]\, z_{n+1} $$

and the frequency of sex-reversed ZZ zygotes with ZZ mothers is given by

$$ r'_{n+1} = P_2[T]\,\frac{f_n - y_n}{f_n} $$

Determined from our experimental data, the probability of sex reversal for
offspring with ZW mother (Fig. 3, shown in black) is given by

$$ P_1[T] = \frac{e^{-63.1402 + 1.8184T}}{1 + e^{-63.1402 + 1.8184T}} $$

and the probability of sex reversal for offspring with ZZ mothers (Fig. 3, shown in
red) is given by

$$ P_2[T] = \frac{e^{-52.9999 + 1.5866T}}{1 + e^{-52.9999 + 1.5866T}} $$

We iterate for an equilibrium solution for y for various values of temperature T. 
Overlapping generations will delay the rate of convergence to equilibrium, but will 
not affect the equilibrium value for a particular temperature. 
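
The iteration described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the recursion is taken as y' = y / (2f), r' = P1[T] y' and r'' = P2[T] (f - y) / f with f = y + r + r'', the starting population is assumed to be purely GSD (y = 0.5, no ZZ mothers), and the function names, tolerance and generation cap are hypothetical choices.

```python
import math


def logistic(intercept, slope, temp_c):
    """Logistic probability exp(x) / (1 + exp(x)) with x = intercept + slope * T."""
    x = intercept + slope * temp_c
    return 1.0 / (1.0 + math.exp(-x))


def p1(temp_c):
    """Probability of sex reversal for ZZ offspring of ZW mothers."""
    return logistic(-63.1402, 1.8184, temp_c)


def p2(temp_c):
    """Probability of sex reversal for ZZ offspring of ZZ mothers."""
    return logistic(-52.9999, 1.5866, temp_c)


def equilibrium_zw(temp_c, y0=0.5, tol=1e-12, max_generations=100000):
    """Iterate the recursion until the ZW zygote frequency stops changing."""
    prob1, prob2 = p1(temp_c), p2(temp_c)
    y = y0                 # ZW zygotes (normal females)
    r = prob1 * y0         # sex-reversed ZZ females with ZW mothers
    r_zz = 0.0             # sex-reversed ZZ females with ZZ mothers (assumed none at start)
    for _ in range(max_generations):
        f = y + r + r_zz               # proportion of female phenotypes
        y_next = 0.5 * y / f           # half the offspring of ZW mothers are ZW
        r_next = prob1 * y_next        # ZZ offspring of ZW mothers that reverse
        r_zz_next = prob2 * (f - y) / f  # ZZ offspring of ZZ mothers that reverse
        converged = abs(y_next - y) < tol
        y, r, r_zz = y_next, r_next, r_zz_next
        if converged:
            break
    return y


if __name__ == "__main__":
    for temp in (28.0, 32.0, 33.0, 34.0, 36.0):
        print(temp, round(equilibrium_zw(temp), 4))
```

Because generations are treated as discrete and non-overlapping here, the sketch addresses only the equilibrium value at each temperature, not the rate of approach in a natural population.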

Extended Data Figure 1 | C-banded P. vitticeps chromosomes. a, Mitotic
metaphase chromosomes of a ZW control female individual. Arrowhead
indicates the presence of a W chromosome identified by dense black staining of
a single microchromosome. b, Mitotic metaphase chromosomes of a female
putative ZZ sex-reversed individual. No evidence of a W chromosome was
detected. c, Mitotic metaphase chromosomes of a control ZZ male individual.
No evidence of a W chromosome was detected. Scale bar, 10 µm.

Extended Data Figure 2 | Comparative genomic hybridization in
P. vitticeps. Genomic DNA was labelled by nick translation incorporating
SpectrumGreen-dUTP for males and SpectrumOrange-dUTP for females.
a, Mitotic metaphase chromosomes of a ZW control female individual.
Arrowhead indicates the presence of a single W microchromosome identified
by the enriched orange fluorescence of female specific genomic DNA labelled
with SpectrumOrange-dUTP. b, Mitotic metaphase chromosomes of a female
putative ZZ sex-reversed individual. No evidence of a W chromosome was
detected. c, Mitotic metaphase chromosomes of a control ZZ male individual.
No evidence of a W chromosome was detected. d–f, DAPI staining of the
same metaphases, control ZW female, sex-reversed ZZ female and control
ZZ male, respectively. Scale bar, 10 µm.

Extended Data Figure 3 | Physical mapping of a W-chromosome-linked
microsatellite motif in P. vitticeps. a, Mitotic metaphase chromosomes of a
ZW control female individual. Arrowhead indicates the presence of a W
chromosome identified by a strong hybridization of (AAGG)8-Cy3 fluorescence
(orange) on a single microchromosome. b, Mitotic metaphase chromosomes of
a female putative ZZ sex-reversed individual. No evidence of a W chromosome
was detected. c, Mitotic metaphase chromosomes of a control ZZ male
individual. No evidence of a W chromosome was detected. d–f, DAPI staining
of the same metaphases, control ZW female, sex-reversed ZZ female and
control ZZ male, respectively. Scale bar, 10 µm.

Extended Data Figure 4 | Modelling the decline of the ZW genotype
resulting from frequency-dependent selection. Frequency of the ZW
genotype declines precipitously with increasing incubation temperature. Our
wild population (shown in red, 14.3% sex reversal) resides on the precipice
between GSD and TSD and requires only a small change in environmental
temperature to precipitate loss of the W chromosome.

Synapse formation is a process tightly controlled in space and time.
How gene regulatory mechanisms specify spatial and temporal 
aspects of synapse formation is not well understood. In the nem- 
atode Caenorhabditis elegans, two subtypes of the D-type inhib- 
itory motor neuron (MN) classes, the dorsal D (DD) and ventral D 
(VD) neurons, extend axons along both the dorsal and ventral 
nerve cords’. The embryonically generated DD motor neurons 
initially innervate ventral muscles in the first (L1) larval stage 
and receive their synaptic input from cholinergic motor neurons 
in the dorsal cord. They rewire by the end of the L1 moult to 
innervate dorsal muscles and to be innervated by newly formed 
ventral cholinergic motor neurons’. VD motor neurons develop 
after the L1 moult; they take over the innervation of ventral mus- 
cles and receive their synaptic input from dorsal cholinergic motor 
neurons. We show here that the spatiotemporal control of synaptic
wiring of the D-type neurons is achieved by an intersectional
transcriptional strategy in which the UNC-30 Pitx-type homeodo- 
main transcription factor acts together, in embryonic and early 
larval stages, with the temporally controlled LIN-14 transcription 
factor to prevent premature synapse rewiring of the DD motor 
neurons and, together with the UNC-55 nuclear hormone receptor, 
to prevent aberrant VD synaptic wiring in later larval and adult 
stages. A key effector of this intersectional transcription factor 
combination is a novel synaptic organizer molecule, the single 
immunoglobulin domain protein OIG-1. OIG-1 is perisynaptically 
localized along the synaptic outputs of the D-type motor neurons 
in a temporally controlled manner and is required for appropriate 
selection of both pre- and post-synaptic partners. 

At the end of the first larval stage, the synaptic outputs from the 
DD motor neurons (MNs) to ventral muscle and their synaptic input 
from cholinergic DA/DB MNs are eliminated and, instead, synapses are
formed onto dorsal muscle and synaptic input is received from choli- 
nergic VA/VB MNs (Fig. 1a)'. We sought to examine how the spatio- 
temporal specificity of this rewiring process is controlled and 
integrated with other aspects of D-type MN differentiation. To address 
this question, we examined the function of the C. elegans Pitx-type 
homeobox transcription factor UNC-30, which is known to control 
GABAergic neurotransmitter identity of the D-type MNs”’. The ana- 
lysis of serial electron micrographs shows that the synaptic patterns of 
the DD and VD neurons are substantially disrupted in unc-30 null 
mutant animals. In adult unc-30(0) animals, VD MNs display ectopic 
synapses onto dorsal muscle and lack notable synaptic inputs from 
DA/DB on the dorsal side (Fig. 1b and Extended Data Fig. 1). 
Furthermore, DD MNs, which normally only form synapses onto 
ventral muscle, show aberrant innervation of dorsal muscle in L1 stage 
unc-30 mutants (Fig. 1c). These synaptic defects were confirmed with 
green fluorescent protein (GFP)-tagged RAB-3 protein, expressed 
specifically in D-type MNs (Fig. 1d). 

unc-30 is expressed in both DD and VD MNs at all stages’, yet 
unc-30 inhibits dorsal DD synapses only in the L1 stage and not at 
later stages. However, at these later stages, unc-30 does inhibit dorsal
synapses from VD neurons, but not the DD neurons. How can the
temporal and spatial specificity of the unc-30(e191) defects be explained? 
A potential answer to this question lies in the previously described 
mutant phenotype of two transcription factors, which recapitulate 
specific components of the cell-type specific, DD and VD synapto- 
genic defects of unc-30 mutants. In animals lacking the lin-14 tran- 
scription factor, whose expression is normally temporally restricted to 
embryonic and first larval stages in most tissues, including the D-type 
motorneurons**, DD MNs form ectopic synapses in the dorsal cord in 
embryonic and L1 stages (Extended Data Fig. 2a; schematized in
Fig. 1e, f). These DD MN defects are similar to those that we observe
in unc-30 mutants. The dorsal ectopic synapses in the VD neurons of 
unc-30 mutant animals (not observed in lin-14) are in turn recapitu- 
lated in animals lacking the unc-55 orphan nuclear receptor, in which 
VD MNs form aberrant synapses in the dorsal cord, as previously 
shown (Extended Data Fig. 2b; schematized in Fig. 1e, f), while DD
wiring at the L1 stage is normal. Taken together, the unc-30 pheno- 
type in the DD and VD neurons can be viewed as a ‘composite’ of the 
two individual phenotypes of lin-14 (DD neurons at L1 stage) and 
unc-55 (VD neurons at later stages) (Fig. 1e, f schematic). One possible
way to explain these concordances of phenotypes is that unc-30
may collaborate with lin-14 to control the expression of a molecule 
that acts in a temporally restricted manner in embryonic and L1 stages 
to inhibit dorsal synapse formation of the DD neurons. In the VD 
MNs, unc-30 may in turn collaborate with unc-55 to control express- 
ion of a molecule that acts in the VD neurons to inhibit dorsal synapse 
formation of these neurons. 

We sought to identify such potential effector molecule(s) through a 
candidate gene approach. In a survey of C. elegans immunoglobulin 
superfamily members, we had previously described a family of small 
proteins that are composed of a single Ig domain, the oig gene family’. 
One of the oig family members, oig-1, encodes a 137 amino acid-long 
protein with a signal sequence and a single IgC2-type domain, but no 
transmembrane domain or predicted glycosylphosphatidylinositol 
(GPI) anchor. Transgenic animals carrying an oig-1 fosmid-based 
reporter construct showed expression both in the DD and VD MNs, 
but no other ventral nerve cord MNs (Fig. 2a). Notably, expression of 
oig-1 in the D-type MNs is temporally controlled in a manner that 
correlates with the distinct periods of inhibition of dorsal muscle 
innervation exhibited by DD and VD neurons. oig-1 is transiently 
expressed in the DD neurons during the time when no dorsal synapses 
are formed (embryos and L1), but is downregulated in the DD neurons 
upon formation of their dorsal synapses (L2 and later; Fig. 2a). In 
contrast, expression of oig-1 in the VD neurons, which have processes 
but no synaptic outputs in the dorsal cord, is continuously maintained 
throughout the life of the neuron (Fig. 2a). 

The transient expression in the DD and continuous expression in 
the VD neurons makes oig-1 a candidate effector gene for the unc-30, 
lin-14 and unc-55 transcription factors. Indeed, the oig-1fosmid::sl2::gfp
reporter fails to be expressed in both DD- and VD-type MNs in unc-30
null mutants at all stages (Fig. 2b). In lin-14 null mutant animals,

Figure 1 | Loss of unc-30 disrupts the synaptic connectivity of the DD and
VD MNs. a, Schematic of DD rewiring. b, Reconstruction of a VD4 MN from
an unc-30(e191) adult animal compared to the same neuron in a wild-type
animal. Extended Data Fig. 1 shows a more detailed presentation of the EM
data. c, Reconstructed DD3 neuron from an unc-30(e191) L1 larva showing
aberrant NMJs in the dorsal cord (D). Previous reconstructions of a wild-type
L1 using the same techniques and personnel showed that a reconstructed DD3
made no NMJs on dorsal muscles and 9 NMJs on ventral muscles.
d, Presynaptic marker RAB-3 localizes mostly to the ventral cord in
wild-type L1 animals (16/20 animals), but ectopically in the dorsal nerve cord
(DNC; outlined in red) in unc-30 mutant animals (19/20). At the L4 stage,
in which presynaptic specializations are observed in both ventral nerve cord
(VNC; outlined in red) and DNC (19/20 animals), unc-30 mutants show
few specializations in the VNC (20/20 animals). Original magnification, ×630.
e, f, Summary of synapse formation defects in unc-30, lin-14, and unc-55
mutants (e) and genetic interpretation (f).


transient expression of the oig-1 reporter in the L1 stage is diminished 
in the DD MNs (Fig. 2b) and temporally prolonged expression of 
lin-14, achieved through genetic removal of a negative regulator of 
lin-14, the microRNA lin-4 (ref. 9), results in prolonged expression 
of oig-1 in the DD MNs into the adult stage (Fig. 2b). In animals 
lacking unc-55, which is normally expressed in VD, but not DD 
MNs", expression of oig-1 in the DD neurons at the L1 stage is un- 
affected, but oig-1 expression in the VD neurons is absent at the adult 
stage (Fig. 2b). The two distinct transcription factor combinations that 
control oig-1 expression in DD (unc-30 and lin-14) and in VD (unc-30 
and unc-55) operate independently since the expression of each tran- 
scription factor is independent of the presence of the other transcrip- 
tion factor (Extended Data Fig. 3a, b)'’. 

The transcriptional nature of oig-1 regulation is corroborated by the 
finding that 1 kilobase (kb) of 5’ sequences of oig-1 conveys the same 
spatiotemporal regulation as the oig-1 fosmid reporter (Extended Data 
Fig. 4). Chromatin immunoprecipitation-sequencing data from the
modENCODE project show binding of UNC-55 to this 1-kb fragment
of the oig-1 locus’, suggesting direct regulation. However, it is 

Figure 2 | Expression of oig-1 correlates with the inhibition of dorsal
synapse formation and is controlled by unc-30, lin-14 and unc-55. a, oig-1
locus, oig-1 fosmid-based reporters (SL2-NLS-GFP fusion to assess gene
regulation; GFP fusion after signal sequence to assess protein localization),
deletion allele and oig-1fosmid::sl2::nls::gfp expression pattern (similar results
with 2 independent lines). b, oig-1 expression is regulated by the transcription
factors unc-30 (0/20 animals express the reporter at any stage), lin-14 (3/20 L1
stage animals express the reporter but at reduced levels; scored in progeny
from lin-14 null mothers carrying a lin-14 rescue array, n = 20), and unc-55
(0/20 animals express the reporter in VD MNs; 20/20 animals express the reporter
in DD neurons), but not by alr-1 (20/20 animals express the reporter), in which
VD identity but not synaptic wiring is affected. In lin-4 animals, oig-1
expression in DDs persists into the adult stage (20/20 adult animals express
the reporter). Original magnification, ×630.


conceivable that unc-55 may also control oig-1 expression indirectly, 
through the homeobox gene irx-1, a target of unc-55 (ref. 13). A 125-base 
pair element that still recapitulates spatiotemporal control in the 
D-type MN expression contains two sites with partial match to the 
UNC-30 binding site’*, and one is required for DD MN expression 
(Extended Data Fig. 4). 

The expression pattern of oig-1 and its regulation by transcription 
factors that regulate synapse formation make oig-1 a candidate for 
involvement in the synapse-organizing activity of these transcription 
factors. Animals that carry an oig-1 deletion allele (Fig. 2a) are viable 
and fertile, but display locomotory defects and hypersensitivity to the 
drug aldicarb (Extended Data Fig. 5a, b), which is characteristic of 
abnormalities in GABAergic signalling’*. In embryonic and L1 stages 
when only the DD MNs are present, the presynaptic vesicle proteins 
SNB-1 and RAB-3 are aberrantly clustered along the dorsal nerve cord 
of oig-1 null mutants (Fig. 3a; Extended Data Fig. 2c). Moreover, the 
postsynaptic GABA receptor UNC-49, which normally clusters on 
ventral muscle at the L1 stage’’, clusters ectopically along the dorsal 
nerve cord (Fig. 3b). Therefore, oig-1—like its upstream regulators 
unc-30 and lin-14—is required to prevent premature DD synapse 
formation in the dorsal nerve cord. 

Examining the synapses of the VD MNs, which normally form synapses
exclusively on ventral muscle, we observed more puncta of three
presynaptic markers (SNB-1 and RAB-3 proteins and the Liprin-α
protein SYD-2) in the dorsal nerve cord and fewer in the ventral nerve 
cord of oig-1 mutants at post-L1 larval stages (Fig. 3c; Extended Data 
Fig. 2c). This indicates that the VD MNs have aberrant synaptic spe- 
cializations in the dorsal nerve cord in oig-1 mutants; although less 

Figure 3 | Aberrant D-type MN synapse formation in oig-1 mutants. a,
Ectopic DD synapses in the dorsal nerve cord of oig-1 mutant L1s. b, Ectopic
UNC-49 localization in oig-1 mutant L1s, as assessed by UNC-49 antibody
staining. UNC-17 staining was used as a control to identify the ventral and
dorsal nerve cords of the animals (not shown). c, Ectopic VD synapses in the
dorsal nerve cord of L4 staged oig-1 mutants. For each marker in each
strain, puncta in the posterior of L4 animals from DD4 to VD11 were scored.
n > 20 for each experiment, **P < 0.01, *P < 0.05, original magnification,
×630.


severe, this phenotype is similar to those of the VD synaptic defects of 
unc-30 and unc-55 mutants. 

We next examined whether oig-1 function is restricted to control- 
ling the synaptic output of D-type neurons or whether oig-1 may also 
affect localization of their synaptic input. Innervation of the DD and 
VD neurons from the cholinergic A- and B-type neurons can be 
visualized with GFP-tagged ACR-12 protein, which localizes to
puncta in the DD neurons in the dorsal nerve cord in the L1 stage,
indicative of the cholinergic input from the DA/DB MNs (Fig. 4a). 
Remarkably, in oig-1 mutants, these dorsal puncta are not observed 
(Fig. 4a). In post L1-stage wild-type animals, ACR-12 protein normally 
labels synapses from DA/DB to the VD neurons in the dorsal cord and 
synapses from VA/VB to the DD neurons in the ventral cord (Fig. 4b). 
In oig-1 mutants, the dorsal, ACR-12(+) synaptic inputs in the VD 
neurons also do not form properly (Fig. 4b). The coincidence of syn- 
aptic input and synaptic output defects in oig-1 mutants indicates that 
the localization of synaptic inputs and outputs is coordinated, and that
this coordination requires the OIG-1 protein. As expected, the oig-1 
defects in synaptic innervation, as determined by ACR-12 clustering, 
are mirrored by loss of the temporal (lin-14) and spatial (unc-55) 
specificity regulators of oig-1 expression (Extended Data Fig. 6a, b). 

The synaptic defects (as well as the locomotory defects) of oig-1 
mutants can be rescued by expressing oig-1 specifically in the D-type 
MN under control of the unc-30 promoter, whereas expression under 
control of a cholinergic A- and B-type MN promoter does not rescue 
mutant background, in L1 stage animals (a) and L4 stage animals (b). n > 20
(Fig. 3a, c and Extended Data Fig. 5a, c). Since oig-1 is predicted to
encode a secreted protein, the lack of rescue of oig-1 with a cholinergic 
promoter suggests that OIG-1 protein functions cell-autonomously in/ 
on the GABAergic DD and VD classes of MNs and argues against a 
long-range, diffusible function of OIG-1. Consistent with this auto- 
nomy, oig-1 mutants show no defects in the localization of synapses of 
the adjacent cholinergic MNs (data not shown). The rescuing activity 
of oig-1 critically depends on the integrity of the IgC2 domain 
(Extended Data Fig. 7). Forced expression of oig-1 in the D-type 
MNs under control of two promoters that are not downregulated in 
the D-type motor neurons (unc-25 and unc-30 promoters) is not suf- 
ficient to prevent the formation of dorsal synaptic outputs of the DD 
neurons (data not shown), indicating that oig-1 collaborates with other 
factors to regulate synapse formation. 

The synaptic wiring defects in oig-1 mutants suggested that the 
OIG-1 protein might be localized in a spatially restricted manner in 
the DD and VD MNs. A fosmid-based reporter in which OIG-1 pro- 
tein is fused to GFP (and which rescues oig-1 locomotory defects and 
the aldicarb hypersensitivity; Fig. 2a and Extended Data Fig. 5a, b) 
shows punctate localization along the processes of the D-type MNs 
(Fig. 5a). The punctate pattern of OIG-1 in D-type MNs revealed a 
surprising localization pattern along the D-type processes. At the L1 
stage, OIG-1 is not localized along the dorsal processes (in which the 
synaptic wiring defects are observed), but is localized along the ventral 
cord. After the generation of the VD MNs (and extinction of OIG-1 
expression in the DD MNs, as described above), OIG-1 protein also 
localized in the VD neurons along the ventral cord. Co-labelling with 
the presynaptic RIM protein UNC-10 and the postsynaptic GABA 
receptor UNC-49 demonstrates that these puncta correspond to the 
perisynaptic region of synapses that D-type MNs form onto ventral 
muscle (Fig. 5b). When ectopically expressed in excitatory cholinergic 
VNC or head MNs, OIG-1-GFP is also targeted to synaptic specializa- 
tions (Fig. 5c), demonstrating that OIG-1 localization is not dependent 
on GABAergic-specific synaptic features, but rather that OIG-1 contains synaptic
targeting properties that are independent of the type of synapse. Taken 
together, the perisynaptic localization pattern of OIG-1 indicates that a 
highly localized, synaptic organizer protein is capable of orchestrating 
the wiring properties of an entire neuron, by promoting synaptic input 
and preventing ectopic synaptic output in a distal portion of the neur- 
onal process (Fig. 5b). Punctate OIG-1 protein is also observed in a few 
head neurons (Extended Data Fig. 8). 

The mutant phenotype of oig-1, specifically the aberrant formation 
of synaptic output from dorsal DD and VD axons, resembles the 
mutant phenotype observed upon removal of the SAD-1 kinase’*””. 
In animals lacking sad-1 or lacking strd-1/STRADα, a pseudokinase
required for SAD-1 localization, OIG-1 clusters ectopically along the
dorsal nerve cord (Extended Data Fig. 9). Conversely, loss of oig-1 does
not affect localization of SAD-1 (data not shown). What sets OIG-1
apart from these molecules is that, in contrast to the pan-neuronally
expressed SAD-1 and STRADα, OIG-1 seems to operate as a spatiotemporally
controlled nexus of this pathway that determines the
spatiotemporal specificity of SAD-1 protein function in the context of the
D-type MNs. The ability of ventrally and presynaptically localized 
OIG-1 to organize distally located synaptic inputs and outputs on 
the dorsal neurite suggests that OIG-1 may trigger a cascade of down- 
stream signalling events or anchor factors on the ventral side which 
would otherwise contribute to synapse organization on the dorsal side. 

In conclusion, we have shown here that three different transcription 
factors cooperate in an intersectional manner in defined spatial and 
temporal contexts to control the expression of a perisynaptically loca- 
lized organizer molecule, OIG-1, which orchestrates the localization of 
synaptic outputs and inputs of two different neuron types (Fig. 5d).
unc-30 needs to cooperate with other transcription factors and these 
collaborators confer spatiotemporal specificity. In embryonic and L1 
stages, spatially but not temporally restricted unc-30 cooperates with 
temporally, but not spatially controlled, lin-14 to prevent DD MN 
synapse assembly at the inappropriate location via induction of oig-1 
expression. After the L1 stage, DD/VD-expressed unc-30 collaborates 
with the subtype (VD)-specific unc-55 transcription factor to restrict 
oig-1 expression to the VD neurons where it organizes synaptic inputs 
and outputs (Fig. 5d). 
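
The intersectional logic summarized above can be written as a small truth function. This is only an illustrative sketch: the Boolean reading (unc-30 together with either lin-14 or unc-55) is our simplification of the model, and the names FactorState, oig1_expressed and d_type_state are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorState:
    """Which of the three transcription factors are present in a cell."""
    unc30: bool  # spatially restricted: D-type (DD and VD) MNs
    lin14: bool  # temporally restricted: embryonic and L1 stages
    unc55: bool  # subtype restricted: VD MNs


def oig1_expressed(state: FactorState) -> bool:
    """Assumed Boolean reading of the model: unc-30 plus one cofactor."""
    return state.unc30 and (state.lin14 or state.unc55)


def d_type_state(subtype: str, stage: str) -> FactorState:
    """Hypothetical helper: factor state of a 'DD' or 'VD' MN at 'L1' or 'L4'."""
    if subtype not in ("DD", "VD") or stage not in ("L1", "L4"):
        raise ValueError("expected subtype 'DD' or 'VD' and stage 'L1' or 'L4'")
    if subtype == "VD" and stage == "L1":
        # Only the DD MNs are present at the L1 stage.
        raise ValueError("VD MNs are not yet present at the L1 stage")
    return FactorState(unc30=True, lin14=(stage == "L1"), unc55=(subtype == "VD"))


if __name__ == "__main__":
    for subtype, stage in [("DD", "L1"), ("DD", "L4"), ("VD", "L4")]:
        print(subtype, stage, oig1_expressed(d_type_state(subtype, stage)))
```

Under these assumptions the function returns true for DD neurons at L1 and for VD neurons at L4, and false for DD neurons at L4, which matches the expression pattern described in the text.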

Our findings demonstrate that the localization of synaptic inputs
and outputs of a neuron are coordinated and that this coordination is
apparently achieved, at least in part, by the OIG-1 organizer protein. It
will be interesting to examine whether similar synaptic organizer
functions can be ascribed to any of the multiple small, single
immunoglobulin domain proteins, some secreted, some transmembrane,
which are encoded in the C. elegans and vertebrate genomes.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

C. elegans strains. Worms were grown at 20°C on nematode growth media 
(NGM) plates seeded with bacteria (Escherichia coli OP50) as a food source. L1 
animals were obtained by hypochlorite-treating gravid adult animals and letting 
embryos hatch and arrest in M9 for 16-18 h. 

Mutant alleles used in this study: LGI: unc-55(e1170), LGII: lin-4(e912), LGIII:
oig-1(ok1687), strd-1(ok2283), LGIV: unc-30(e191), LGX: lin-14(ma135), alr-1(oy42),
sad-1(ky289), acr-12(ok367).
Transgenes. otEx5663 (unc-30 prom::gfp::rab-3::unc-10 3′UTR) used in Fig. 1,
Extended Data Fig. 2, otEx4816 (oig-1 fosmid::sl2::gfp), otIs450 (oig-1 fosmid::sl2::gfp)
used in Fig. 2, otEx5651 (unc-30 prom::gfp) used in Extended Data Fig. 3a,
wgIs395 (unc-30 fosmid::TY1::gfp::3xFLAG) used in Extended Data Fig. 3a, otEx5765
(lin-14 fosmid::gfp) used in Extended Data Fig. 3b, juIs1 (unc-25p::snb-1::gfp) used in
Fig. 3a, c, Extended Data Figs 5c and 7, otEx4955 (unc-30 2.4kb prom::oig-1 line 1)
used in Fig. 3a, c and Extended Data Fig. 5a, c, otEx4956 (unc-30 2.4kb prom::oig-1
line 2) used in Fig. 3a, c, otEx4941 (unc-3 prom::oig-1 line 1) used in Fig. 3a, c and
Extended Data Fig. 5c, otEx4942 (unc-3 prom::oig-1 line 2) used in Fig. 3a, c, hpIs3
(unc-25p::syd-2::gfp) used in Fig. 3c, ufIs92 (unc-47 prom::acr-12::gfp) used
in Fig. 4 and Extended Data Fig. 6, otEx5664 (oig-1 fosmid::gfp) used in Fig. 5a, b
and Extended Data Figs 5a, b, 8 and 9, otEx5858 (unc-3 558bp prom::oig-1::gfp)
used in Fig. 5c, otEx5859 (del-1 488bp prom::oig-1::gfp) used in Fig. 5c, otEx6212
(unc-30 2.4kb prom::oig-1E64A line 1), otEx6213 (unc-30 2.4kb prom::oig-1E64A line 2),
otEx6214 (unc-30 2.4kb prom::oig-1W75A line 1) and otEx6215 (unc-30 2.4kb prom::
oig-1W75A line 2) all used in Extended Data Fig. 7.

oig-1 cis-regulatory analysis constructs (all used in Extended Data Fig.
otEx5993 (oig-1?"""::NLS::gfp line 1), otEx5994 (oig-1?°"!::NLS::gfp line
otEx5995 (oig-1°""::NLS::gfp line 3), otEx5996 (oig-1?"”"::NLS::gfp line
otEx5997 (oig-1?"”"?::NLS::gfp line 2), otEx5998 (oig-1?”?::NLS::gfp line
otEx6003 (oig-1°”"::NLS::gfp line 1), otEx6004 (oig-1?"”"*::NLS::gfp line
otEx6005 (oig-1°”"::NLS::gfp line 3), otEx6006 (oig-1?""*::NLS::gfp line
otEx6007 (oig-1°"”"*::NLS::gfp line 2), otEx6008 (oig-1?""*::NLS::gfp line
otEx6009 (oig-1°”*::NLS::gfp line 1), otEx6010 (oig-1?""*::NLS::gfp line
otEx6011 (oig-1°”*::NLS::gfp line 3), otEx6034 (oig-1?"”"*::NLS::gfp line
otEx6035 (oig-1°"*::NLS::gfp line 2), otEx6036 (oig-1?”"*::NLS::gfp line
otEx6037 (oig-1°”"”::NLS::gfp line 1), otEx6038 (oig-1?”"’::NLS::gfp line
otEx6039 (oig-1°”"”::NLS::gfp line 3), otEx6060 (oig-1?""*::NLS::gfp line
otEx6061 (oig-1?""®::NLS::gfp line 2), otEx6121 (oig-1?"”?::NLS::gfp line
otEx6122 (oig-1°"°"”::NLS::gfp line 2), otEx6079 (oig-1°"°""::NLS::gfp line
otEx6080 (oig-1?"”""°::NLS::gfp line 2), otEx6081 (oig-1?"”""°::NLS::gfp line 3).
Generation of oig-1 transgenes. The oig-1 fosmid::sl2::gfp reporter (shown in Fig. 2)
was generated by fosmid recombineering using the fosmid WRM0614cC07 and an
SL2-based, nuclear-localized gfp cassette, pBALU9. The inclusion of the SL2
sequence results in the production of nuclear-localized GFP. The reporter was
injected at 10 ng µl⁻¹ with rol-6(su1006) at 2 ng µl⁻¹ and sonicated OP50 genomic
DNA at 120 ng µl⁻¹. An extrachromosomal array (otEx4816) was integrated to
yield otIs450 IV.

The oig-1 translational fosmid gfp reporter, oig-1 fosmid::gfp (shown in Fig. 5), was
generated by fosmid recombineering using the fosmid WRM0614cC07 and a gfp
cassette, pBALU25, modified from pBALU1. pBALU25 was created by mutating
the coding sequence of gfp in pBALU1 to contain the amino acid changes F64L
and S65T. This cassette was recombineered and inserted into the oig-1 fosmid
immediately following the predicted signal peptide sequence (after the 72nd
base pair of oig-1). This translational reporter was injected at 10 ng µl⁻¹ with
ttx-3 prom::mCherry at 3 ng µl⁻¹ and sonicated OP50 genomic DNA at 120 ng µl⁻¹.

The unc-3 558bp prom::oig-1::gfp construct was generated by TOPO cloning a PCR
fusion of 558 bp upstream of the unc-3 translational start site with a fragment of the
oig-1 fosmid::gfp construct, containing from the oig-1 translational start site to 500 bp
downstream of the stop codon. This construct was PCR-amplified from the start of the
unc-3 promoter to 500 bp downstream of the oig-1 locus and injected at 10 ng µl⁻¹ with
ttx-3 prom::mCherry at 3 ng µl⁻¹ and sonicated OP50 genomic DNA at 120 ng µl⁻¹.

The del-1 488bp prom::oig-1::gfp construct was generated by TOPO cloning as for the
unc-3 558bp prom::oig-1::gfp construct, but with 488 bp upstream of the del-1 translational start site.
This construct was PCR-amplified from the start of the del-1 promoter to 500 bp
downstream of the oig-1 locus and injected at 10 ng µl⁻¹ with ttx-3 prom::mCherry at
3 ng µl⁻¹ and sonicated OP50 genomic DNA at 120 ng µl⁻¹.

The unc-30p::oig-1 construct was generated by cloning the 2.4 kb unc-30
promoter into the EcoRV site of the first MCS of pPD49.26 and the oig-1 locus
from the start to stop codon (1,521 bp) into the BamHI site of the second MCS.
This construct was digested with PvuI and injected at 5 ng µl⁻¹ with myo-2p::gfp at
3 ng µl⁻¹ and OP50 genomic DNA at 120 ng µl⁻¹. Site-directed mutagenesis of
this construct was used to generate unc-30p::oig-1E64A and unc-30p::oig-1W75A.
These constructs were injected as described above.

The unc-3p::oig-1 construct was generated by cloning the 558 bp unc-3
promoter into the EcoRV site of the first MCS of pPD49.26 and the oig-1 locus from
the start to stop codon (1,521 bp) into the BamHI site of the second MCS. This
construct was digested with PvuI and injected at 10 ng µl⁻¹ with myo-2p::gfp at
3 ng µl⁻¹ and OP50 genomic DNA at 120 ng µl⁻¹.

The oig-1 promoter deletion constructs were generated by cloning the various
promoter fragments into the HindIII and BamHI sites of the MCS of a 2XNLSGFP
plasmid. Promoter constructs with potential UNC-30 binding sites deleted
were generated using site-directed mutagenesis. These constructs were injected
at 50 ng µl⁻¹ with rol-6 at 30 ng µl⁻¹ and rol-6(su1006) at 20 ng µl⁻¹.
Wormtracker assays. Tracking assays were performed as previously described.
Briefly, L4 animals were placed on an NGM plate seeded with 20 µl of OP50
bacteria in the centre. Automated tracking was performed with the Worm
Tracker 2.0 (WT2), which uses a camera to track and record individual worms.
Twenty worms of each genotype were tracked for 5 min each at 20 °C. Analysis was
performed as previously described.

Aldicarb assays. Aldicarb assays were performed as previously described.
Briefly, ~20 young adult animals (24 h after L4 stage, blinded for genotype) were
picked onto freshly seeded NGM plates containing 1 mM aldicarb (ChemService).
Worms were assayed for paralysis every 15 min by prodding with a platinum wire.
A worm was considered paralyzed if it did not respond to prodding to the head and
tail three times each at a given time point. Strains were grown and assayed at 20 °C.
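
As an illustration of how such scoring could be turned into a paralysis time course, the sketch below computes the fraction of paralyzed animals at each 15-min scoring point. The function name, the input layout and the 120-min assay length are assumptions made for the example; they are not taken from the protocol.

```python
from typing import Optional, Sequence


def paralysis_time_course(
    first_paralysed_min: Sequence[Optional[int]],
    assay_end_min: int = 120,  # assumed assay length; not stated in the text
    interval_min: int = 15,    # scoring interval stated in the protocol
) -> list[tuple[int, float]]:
    """Return (time in min, fraction paralyzed) pairs for one plate.

    Each entry of first_paralysed_min is the first scoring time at which a worm
    failed to respond to prodding, or None if it never became paralyzed.
    """
    n = len(first_paralysed_min)
    if n == 0:
        raise ValueError("no animals scored")
    course = []
    for t in range(0, assay_end_min + 1, interval_min):
        paralysed = sum(1 for p in first_paralysed_min if p is not None and p <= t)
        course.append((t, paralysed / n))
    return course


if __name__ == "__main__":
    # Hypothetical plate of four worms; one never becomes paralyzed.
    for time_min, fraction in paralysis_time_course([15, 30, 30, None], assay_end_min=45):
        print(time_min, fraction)
```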
Antibody staining. Antibody staining was performed as previously described.
Briefly, following a freeze-crack procedure, worms were fixed by a treatment in
ice-cold acetone for 5 min and then ice-cold methanol for 5 min. Worms were
collected in 1× PBS and centrifuged briefly. The PBS was removed and worms
were incubated in a blocking solution (1× PBS, 0.2% gelatin, 0.25% Triton X-100)
for 30 min at 20 °C. After the blocking solution was removed, worms were
incubated with primary antibodies diluted in PGT (1× PBS, 0.1% gelatin, 0.25% Triton
X-100) overnight at 4 °C. The anti-UNC-49 antibody was used at a 1:500
dilution. The anti-UNC-17 antibody was used at a 1:3,000 dilution. The anti-GFP
antibody (Life Technologies A10262) was used at 1:1,000. The anti-RIM2 antibody (used to
recognize UNC-10) was used at a 1:10 dilution (Developmental Studies Hybridoma
Bank, University of Iowa). Worms were washed 5 times in wash solution (1× PBS,
0.25% Triton X-100) for 20 min each wash. Worms were then incubated with
secondary antibodies diluted 1:1,000 in PGT for 3 h at 20 °C. Alexa Fluor 488 goat
anti-chicken (Invitrogen A11039) was used to detect the anti-GFP antibody. Alexa
Fluor 594 donkey anti-mouse (Invitrogen A-21203) was used to detect the
anti-RIM2 antibody. Alexa Fluor 555 donkey anti-rabbit (Invitrogen A-31572) was used
to detect the anti-UNC-49 antibody. Alexa Fluor 488 donkey anti-mouse
(Invitrogen A-21202) was used to detect the anti-UNC-17 antibody. Worms were
then washed 5 times for 20 min each wash. Following the final wash, worms were
mounted in Fluorogel with Tris buffer (Electron Microscopy Sciences).

Statistical analysis. For results shown in Figs 3a, b, 4a, b, 6a, c, 7 and 9, we 
performed Fisher’s exact test, **P < 0.01, *P < 0.05. For results shown in 
Fig. 3c, Extended Data Fig. 2c, we performed a Student’s t-test (2 sided, type 2), 
**P < 0.01, *P < 0.05. For WormTracker analysis in Extended Data Fig. 3a, we 
used Wilcoxon rank sum to test the differences between oig-1, wild-type, and 
rescued strains, **q < 0.01, *q < 0.05. No statistical methods were used to 
predetermine sample size, and the experiments were not randomized. 
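
For readers unfamiliar with Fisher's exact test, the following self-contained sketch shows how a two-sided P value can be computed for a 2 × 2 table of animals scored with or without a phenotype. It is a generic textbook implementation rather than the software used for this study, and the counts in the usage example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: 11 of 20 mutant and 4 of 20 wild-type animals
    # show ectopic dorsal puncta.
    print(fisher_exact_two_sided(11, 9, 4, 16))
```

The small tolerance in the comparison guards against floating-point ties between equally probable tables.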

Extended Data Figure 1 | Electron microscopical analysis of unc-30(e191)
mutants. a, Reconstructions of a VD4 and a DD2 MN from an unc-30(e191) 
animal compared to the same neurons in a wild-type animal. Cell bodies 
(large black dots) are all situated in the ventral cord. Processes emanate 
anteriorly (upwards on plots) from the cell body and run along the ventral cord. 
Lateral branches leave the ventral cord (V) and run round to the dorsal cord (D) 
as a circumferential commissure (broken horizontal process in the plots). 
Commissures from unc-30 mutant type-D neurons are situated in the same 
regions as those of their wild-type counterparts. However, the cell bodies of DD 
neurons are often displaced anteriorly in the mutants, with the consequence 
that DD neurons have shorter processes in the ventral cord (C). Processes in the 
dorsal cord run anteriorly in mutant animals, whereas they branch with the 
main branch running posteriorly in wild-type animals. Neuromuscular 
junctions (NMJs) in unc-30 mutants are made predominantly in the dorsal cord 
by both the DDs and VDs, whereas in the wild type, only DD neurons innervate 
dorsal muscles. The synaptic inputs to the DD and VD neurons in mutant 
animals (inward pointing arrows for chemical synapses and “T” for gap 
junctions) are generally abnormal. The reconstructed DD2 neuron received 
synapses from several unidentified processes on the ventral side (depicted with 
a ‘?’). These processes do not belong to VA or VB neurons, the normal pre- 
synaptic partners of DD, as all the local VA and VB neurons were identified. 
From the location and synaptic behaviour of these processes, it is probable that 
they belong to interneurons which span the length of the cord and do not 
usually innervate D-type neurons. An asterisk ('*') indicates that a synapse has
multiple post-synaptic elements. A total of six VDs and three DDs were
reconstructed. Each reconstruction covered around 2,000 electron microscopy
sections, corresponding to a length of 100 µm along the body of the animal. The
unc-30(e191) mutation does not affect MN cell body position or the synaptic 
behaviour of the DA/DB neurons, except in regard to their synapses to D-type 
neurones; this made it possible to unambiguously identify D type neurones 
from their positions and by eliminating other identified classes of MNs. 
Electron microscopy and reconstructions of micrographs of serial sections were 
performed as described in ref. 31. b, Electron micrographs. The processes of DD 
and VD neurons normally run subjacent to the bounding basal lamina of the 
ventral cord immediately dorsal to the axons of the VA and VB neurons (a). 
NMJs are made through the basal lamina onto muscle arms (M). In
unc-30(e191) animals, the axons of the DD and VD neurons wander round the cord
and do not run in defined locations; the configuration shown in b is typical 
but not stereotyped. Very few NMJs are made in the ventral cord by the DD 
or VD neurons in unc-30 mutants; those that are made look rather small (c). 
Atypical synapses (d) are often made onto DD or VD processes from neurons 
such as the touch receptor neuron AVM (6). It is probable that these synapses 
are not normally found as the processes of AVM, and DD or VD do not 
normally run alongside each other in wild-type animals. Both the DD and VD 
MNs make NMJs to dorsal muscles in the dorsal cord of unc-30 mutants, 
whereas in wild-type animals, only DD neurons do so and VD neurons 
innervate ventral muscles in the ventral cord. e, f, NMJs made by VD (e) and 
DD (f) neurons in the dorsal cord of an unc-30(e191) mutant. Scale bars, 1 µm.

Extended Data Figure 2 | RAB-3 is ectopically localized in lin-14, unc-55,
and oig-1 mutants. a, Presynaptic marker RAB-3 ectopically localizes to the
dorsal nerve cord (DNC; marked in red) in lin-14 mutant animals. RAB-3-GFP
puncta (from otEx5663, unc-30p::gfp::rab-3) localize mostly to the ventral nerve
cord (VNC; marked in red) in wild-type L1 animals (left). Ectopic RAB-3-GFP
puncta localize mostly to the dorsal nerve cord in 95% of lin-14 L1 animals
(right, scored in progeny from lin-14 null animals carrying a lin-14 rescue
array). Ventral and dorsal nerve cords are indicated by red dotted lines. L1
animals were obtained by hypochlorite-treating gravid adult animals and 
letting embryos hatch and arrest in M9 for 16-18 h. n > 20 for each strain 
scored. b, RAB-3 ectopically localizes to the dorsal nerve cord in unc-55 L4
mutant animals. RAB-3-GFP puncta localize to both the ventral (VNC) and
dorsal (DNC) nerve cords in 100% of wild-type L4 animals (left). RAB-3-GFP
puncta localize mostly to the dorsal nerve cord in 100% of unc-55 L4
animals (right). Ventral and dorsal nerve cords are indicated by red dotted
lines. Signals between the nerve cords are autofluorescence from the gut. n > 20 
for each strain scored. c, RAB-3 ectopically localizes to the dorsal nerve cord in 
oig-1 mutants. RAB-3 normally localizes to the ventral nerve cord (VNC, 
marked in red) in wild-type L1 animals (top left). Ectopic RAB-3-GFP puncta 
localize to the dorsal nerve cord in 55% of oig-1 L1 animals (top right, compared 
to 20% of wild-type animals). L1 animals were obtained by hypochlorite-treating
gravid adult animals and letting embryos hatch and arrest in M9
for 16-18 h. n > 20 for each strain scored. In wild-type L4 animals, more
RAB-3-GFP puncta are localized in the VNC than in the DNC of the animal
(bottom, black dots). Conversely, in oig-1 mutants, more RAB-3-GFP
puncta are localized in the DNC than in the VNC (bottom, red dots).
**P < 0.01, n = 20 for each strain, original magnification, ×630.

Extended Data Figure 3 | Mutual independence of transcription factor
activities. a, Expression of unc-30 is not affected by loss of lin-14 or unc-55. A
2.4 kb unc-30 promoter gfp fusion reporter is expressed in the DD MNs (green
circles) in wild-type L1 animals; this expression is not affected in lin-14(−)
mutant animals (scored in progeny from lin-14 null mothers carrying a lin-14
rescue array). An unc-30 fosmid-based reporter, kindly provided by the
TransgeneOme project, is expressed in the DD (green circles) and VD (blue
squares) MNs (DD4 to VD10 shown) in wild-type L4 animals; this expression
is not affected in unc-55(e1170) L4 animals. n > 20 for each genotype.
b, Expression of a lin-14 fosmid-based reporter construct is unaffected by
loss of unc-30. lin-14 is expressed in the DA, DB, and DD MNs in the VNC at
the L1 stage (average number of VNC cells = 15); this expression is not affected
in unc-30(e191) mutant L1s (average number of VNC cells = 15); n > 20
for each genotype. Loss of unc-30 also does not affect unc-55 expression, as
shown by ref. 11. Original magnification, ×630.

Extended Data Figure 4 | Deletion of a putative UNC-30 binding site results
in loss of oig-1 expression in the D-type neurons. Regions of the oig-1
promoter were fused to gfp to analyse expression. (+) indicates robust
expression of the reporter construct in the specified cell type, whereas (−)
indicates loss of expression in the specified cell type. Twenty worms at both the
L1 and L4 stage were scored for each line. Expression of a 1 kb promoter
reporter (prom 1) recapitulates expression of the oig-1 fosmid::gfp reporter in the
D-type MNs (see Fig. 2). This region contained 3 elements that exactly match
the UNC-30 consensus binding site (TAATC, purple box) and multiple
others that are a partial match to the UNC-30 binding site (magenta and blue
boxes). Further deletion of this prom 1 defined a minimal 125 bp element
that is sufficient to drive oig-1 expression in the D-type MNs (prom 6). This
element contains two sites that partially match the UNC-30 binding sequence.
Deletion of the AAATC site in the context of the 1 kb promoter (prom 9) has 
no effect on oig-1 expression in the D-type MNs. Deletion of the TAAAC 
site in the context of the 1 kb promoter reporter (prom 10) results in complete 
loss of oig-1 expression specifically in the D-type MNs. We noted that some of 
the smaller transcriptional reporter lines show extended expression of gfp in 
the DD motor neurons. Since DD rewiring is delayed upon partial removal 
of the homeobox gene irx-1 (ref. 13), it is possible that in these reporters, 
potential IRX-1 binding sites are deleted. We have not pursued the effect of 
irx-1 on oig-1 expression as the lethality associated with complete loss of irx-1 
function complicates an analysis of irx-1 null mutant phenotypes in D-type 
motor neurons. 

Extended Data Figure 5 | oig-1 mutant defects and their rescue. a, oig-1
mutants display locomotory defects. Locomotion of L4 animals was analysed
with tracking assays. The graphs on the left side of each panel correspond to
assays comparing wild-type, oig-1 mutant, and oig-1; unc-30p::oig-1 animals.
The graphs on the right side of each panel correspond to assays comparing
wild-type, oig-1 mutant, and oig-1; oig-1 fosmid::gfp animals. Twenty animals
(each dot on a plot) were tracked for each genotype for both comparisons.
Mean and Q values are indicated. Note that in a previously published analysis
of a large panel of available mutants, the same set of locomotory defects that
we describe here for oig-1 mutants was found in unc-55
mutants, albeit in a stronger manner than in oig-1 mutants. Also note that the
very strong locomotory defects of unc-30 mutants are qualitatively very different
from those of oig-1 mutants, but this is to be expected as unc-30 mutants not only
show the synaptic defects that we describe here, but also lack the
neurotransmitter GABA (ref. 3), thereby disabling any neuromuscular signalling. Top,
the midbody speed of oig-1 mutant animals is significantly lower than that of
wild-type animals. This defect is partially rescued (statistically different from
oig-1 mutants but also from wild-type animals) by expressing unc-30p::oig-1 in
oig-1 animals (left graph). The lower midbody speed of oig-1 mutants is
completely rescued (statistically different from oig-1 mutants but not from
wild-type animals) by expressing oig-1 fosmid::gfp in oig-1 mutants. Middle,
oig-1 mutants exhibit more dwelling than wild-type L4 animals. This defect
is partially rescued (statistically different from oig-1 mutants but also from
wild-type animals) by expressing unc-30p::oig-1 in oig-1 animals (left graph).
The increased dwelling of oig-1 mutants is completely rescued (statistically
different from oig-1 mutants but not from wild-type animals) by expressing
oig-1 fosmid::gfp in oig-1 mutants. Bottom, oig-1 mutants exhibit an increased
path curvature compared to wild-type animals. This defect is not rescued by
expressing unc-30p::oig-1 in oig-1 animals (left graph). The increased path
curvature of oig-1 mutants is completely rescued (statistically different from
oig-1 mutants but not from wild-type animals) by expressing oig-1 fosmid::gfp
in oig-1 mutants. b, Aldicarb-sensitivity defects in oig-1 mutants. oig-1
mutant young adult animals (red squares), which display aberrant GABAergic
synapses in both the ventral and dorsal cord, show hypersensitivity to
aldicarb-induced paralysis compared to wild type (black triangles). Expression of
oig-1 fosmid::gfp from a multicopy transgenic array (green circles) not only
rescues the oig-1 mutant phenotype, but even results in a slight hyposensitivity
to aldicarb. Worms (blinded for genotype) were tested every 15 min for
paralysis by touching the head and tail three times each. n = 20 for each strain,
repeated 3 times. c, Expression of oig-1 in the D-type neurons rescues ectopic
DD synapses in the dorsal nerve cord. At the L1 stage when only the DD MNs
are present, SNB-1-GFP (from juIs1, unc-25p::snb-1::gfp) localizes to the
ventral nerve cord (VNC) in wild-type animals (top left). Ectopic SNB-1-GFP
puncta localize to the dorsal nerve cord (DNC) of oig-1 mutant L1s (top
right). This phenotype is rescued by expressing oig-1 in the D-type MNs
(unc-30p::oig-1, otEx4955, bottom left), but not by expressing oig-1 in the
neighbouring cholinergic MNs (unc-3p::oig-1, otEx4942, bottom right).
Original magnification, ×630; white boxes indicate the dorsal nerve cord.

Extended Data Figure 6 | ACR-12 is mislocalized in lin-14 and unc-55
mutants. Cholinergic innervation to the D-type MNs is visualized with an
unc-47p::acr-12::gfp reporter transgene, maintained in an acr-12(ok367) mutant
background. a, ACR-12 puncta localization is affected by loss of lin-14. In
wild-type L1 animals, ACR-12 puncta are observed only in the DD neurons in
the dorsal nerve cord (DNC) (left). In lin-14 mutant L1 animals (scored in
progeny from lin-14 null animals carrying a lin-14 rescue array), ACR-12
puncta are detected in the ventral nerve cord (VNC) of the DD MNs.
Quantification of these data is shown in the graph. Some dorsal puncta in
the DD MNs were still observed in 83% of the lin-14 mutant L1s that had
puncta in the ventral nerve cord. L1 animals were obtained by
hypochlorite-treating gravid adult animals and letting embryos hatch and arrest in M9 for
16-18 h. n > 20 for each strain scored, **P < 0.01. b, ACR-12 puncta
localization is affected by loss of unc-55. In wild-type L4 animals, ACR-12
puncta are observed in both the ventral (VNC) and dorsal (DNC) nerve cords
(left). In unc-55 mutant L4 animals, ACR-12 puncta are observed mostly in
the ventral nerve cord. Ventral and dorsal nerve cords are
marked by red dotted lines. Quantification of these data is shown in
the graph. n > 20 for each strain, **P < 0.01, original magnification, ×630.

Extended Data Figure 7 | The IgC2 domain is necessary for OIG-1 function.
At the L1 stage when only the DD MNs are present, ectopic SNB-1-GFP puncta
(from juIs1, unc-25p::snb-1::gfp) localize to the dorsal nerve cord (DNC) of
oig-1 mutant L1s (red bar) but not of wild-type animals (black bar) (see Fig. 3a).
Based on an alignment with the hidden Markov model (HMM) Ig domain
(top), a highly conserved residue (W75) and a nonconserved residue (E64) in
the OIG-1 Ig domain were mutated in the context of an unc-30p::oig-1
transgene that is able to rescue the L1 ectopic synapse defects (see Fig. 3a).
The unc-30p::oig-1E64A transgenes (otEx6212, otEx6213) were still able to
rescue the synaptic defects of oig-1 L1 animals (green bars), whereas the
unc-30p::oig-1W75A transgenes (otEx6214, otEx6215) had no rescue ability
(blue bars). L1 animals were obtained by hypochlorite-treating gravid adult
animals and letting embryos hatch and arrest in M9 for 16-18 h. n > 20 for
each strain scored, **P < 0.01, *P < 0.05.

Extended Data Figure 8 | OIG-1 localization in other neuron types. OIG-1-GFP
(from oig-1 fosmid::gfp) localizes in a punctate manner along axons in the
nerve ring (blue arrow) and along a pair of neurons in the pharynx, tentatively
identified as the M2 MNs (red arrows point to cell body and process). These
neurons form synapses onto pharyngeal muscles along their processes, and
these processes also show punctate localization of OIG-1. Original
magnification, ×630.

Extended Data Figure 9 | OIG-1 is mislocalized in sad-1 and strd-1 mutants.
OIG-1-GFP (from oig-1 fosmid::gfp) localizes to the ventral nerve cord (VNC) of
wild-type L1 animals (left). In sad-1 mutants (middle), OIG-1-GFP is
ectopically localized to the dorsal side (DNC) of L1 animals. In strd-1 mutants,
OIG-1-GFP is ectopically localized to the dorsal side of L1 animals.
Quantification of the data is shown in the graph. n > 20 for each strain
scored, **P < 0.01, original magnification, ×630.

Cell-intrinsic adaptation of lipid composition to local
crowding drives social behaviour

Mathieu Frechin, Thomas Stoeger, Stephan Daetwyler, Charlotte Gehin, Nico Battich, Eva-Maria Damm, Lilli Stergiou,
Howard Riezman & Lucas Pelkmans


Cells sense the context in which they grow to adapt their phenotype
and allow multicellular patterning by mechanisms of autocrine
and paracrine signalling. However, patterns also form in cell
populations exposed to the same signalling molecules and
substratum, which often correlate with specific features of the
population context of single cells, such as local cell crowding. Here we
reveal a cell-intrinsic molecular mechanism that allows
multicellular patterning without requiring specific communication
between cells. It acts by sensing the local crowding of a single cell
through its ability to spread and activate focal adhesion kinase
(FAK, also known as PTK2), resulting in adaptation of genes
controlling membrane homeostasis. In cells experiencing low
crowding, FAK suppresses transcription of the ABC transporter
A1 (ABCA1) by inhibiting FOXO3 and TAL1. Agent-based
computational modelling and experimental confirmation identified
membrane-based signalling and feedback control as crucial for
the emergence of population patterns of ABCA1 expression, which
adapts membrane lipid composition to cell crowding and affects
multiple signalling activities, including the suppression of ABCA1
expression itself. The simple design of this cell-intrinsic system and
its broad impact on the signalling state of mammalian single cells
suggest a fundamental role for a tunable membrane lipid
composition in collective cell behaviour.

Adherent tissue culture cells spread out their cell surface more when
experiencing low local crowding than high local crowding, resulting in
a higher number of focal adhesions, sites of cellular attachment to the
extracellular matrix (ECM), and higher levels of activated FAK
(Extended Data Fig. 1a). FAK is recruited to focal adhesions, where
it undergoes autophosphorylation, and subsequently recruits and
phosphorylates phosphatidylinositol-3-OH kinase (PI(3)K) and many
other proteins involved in signalling, cell adhesion and cytoskeletal
dynamics. FAK may thus, in a cell-intrinsic manner, sense local cell
crowding by reacting to the available space and mechanical constraints
imposed during cell population growth, and signal this to
downstream cellular functions. To test this, we compared the extent of
adaptation of the transcriptome to cellular crowding in adherent
embryonic fibroblasts from a FAK-knockout mouse (FAK-KO) with
cells from the same background in which FAK was stably re-expressed
(FAK-rescue).

A total of 1,014 genes (~5% of the whole genome) adapt their
transcript abundance to cellular crowding, of which 80% required
the presence of FAK to adapt (Fig. 1a). Although FAK induces genes
related to cell growth and proliferation (Extended Data Fig. 1b), it
suppresses genes involved in membrane and organelle homeostasis
(Fig. 1b) in cells experiencing low crowding, amongst which are 4
ATP-binding cassette (ABC) transporters (Abca1, Abca6, Abca9 and
Abcg2) (Extended Data Fig. 1c). Abca1 was the overall second most
strongly suppressed (~14-fold) gene by FAK (Fig. 1a) and the
strongest hit amongst all genes in functional annotation terms related
to membrane organization (Fig. 1b). ABC transporters mediate
the transport of various substrates across membranes, including
phospholipids and cholesterol.
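
To make the notion of FAK-dependent adaptation concrete, the sketch below classifies a gene from its transcript abundance at high and low crowding, with and without FAK, using the |log2 ratio| > 1.5 threshold quoted in the Fig. 1a legend. The four categories, the function names and the example abundances are illustrative assumptions and do not reproduce the analysis pipeline used in the study.

```python
import math

LOG2_THRESHOLD = 1.5  # |log2 ratio| cut-off quoted in the Fig. 1a legend


def log2_ratio(high_crowding: float, low_crowding: float) -> float:
    """log2 of transcript abundance at high over low crowding."""
    return math.log2(high_crowding / low_crowding)


def classify_gene(
    hc_fak: float, lc_fak: float, hc_ko: float, lc_ko: float,
    threshold: float = LOG2_THRESHOLD,
) -> str:
    """Assumed four-way classification of crowding adaptation for one gene."""
    adapts_with_fak = abs(log2_ratio(hc_fak, lc_fak)) > threshold
    adapts_without_fak = abs(log2_ratio(hc_ko, lc_ko)) > threshold
    if adapts_with_fak and not adapts_without_fak:
        return "FAK-dependent adaptation"
    if adapts_with_fak and adapts_without_fak:
        return "FAK-independent adaptation"
    if adapts_without_fak:
        return "adapts only without FAK"
    return "no adaptation"


if __name__ == "__main__":
    # Hypothetical abundances: about 14-fold lower at low crowding when FAK
    # is present, and high at both crowding levels without FAK.
    print(classify_gene(hc_fak=140.0, lc_fak=10.0, hc_ko=150.0, lc_ko=140.0))
```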

Single-molecule fluorescence in situ hybridization and automated
image analysis confirmed the transcriptomics results at the
single-cell level, showing that FAK controls the adaptation of Abca1
transcript abundance in single cells to local crowding (Fig. 1c and Extended Data
Fig. 1d, e). This adaptation involves low (1-20) and highly variable
transcript copy numbers (Extended Data Fig. 1d), and also occurs in
the presence of growth factors and cytokines in the medium (Extended
Data Fig. 1f).

Predicted candidate transcription factors (see Supplementary
Information and Supplementary Table 2) were tested for their
involvement in this adaptation using RNA-mediated interference (RNAi) in
cells that lack FAK (FAK-KO) and thus highly express Abca1
independent of crowding. RNAi of Foxo3, Tal1 and Stat4, as well as Lxrb
(liver X receptor beta, also known as Nr1h2), the canonical
transcription factor driving expression of ABCA1 (ref. 12), reduced Abca1
transcript abundance in these cells by ~50% (Extended Data Fig. 2a).
As TAL1 and FOXO3 are phosphorylated by the serine/threonine
kinase AKT, which is activated by PI(3)K downstream of FAK,
leading to rapid degradation of TAL1 (ref. 13) and inactivation of FOXO3
(ref. 14), we focused on these transcription factors. Chromatin
immunoprecipitation (ChIP) experiments (Extended Data Fig. 2b) revealed
that in cells lacking FAK, both FOXO3 and TAL1 bind to Abca1
chromatin independent of cellular crowding. In cells expressing
FAK, FOXO3 and TAL1 bind to Abca1 chromatin at closely located
positions only when cells experience high crowding (Fig. 2a). This is
in contrast to LXRB, which constitutively binds to Abca1 chromatin
independent of cellular crowding or the presence of FAK (Fig. 2a).
Furthermore, western blots of multiple adherent cell lines revealed
that cells experiencing low crowding contain higher levels of
phosphorylated PI(3)K, AKT and FOXO3 and lower levels of TAL1 than
cells experiencing high crowding. Consequently, these cells express a
low amount of ABCA1 protein at low cellular crowding. Inhibition
of PI(3)K (by wortmannin or LY-294002), lack of FAK (FAK-KO),
or inhibition of FAK (by Y15) abolished these differences, leading to
ABCA1 expression also in cells experiencing low crowding (Fig. 2b-e
and Extended Data Fig. 2c-e). These effects were observed in mouse
embryonic fibroblasts, human lung epithelial cells and freshly isolated
human keratinocytes. Micropatterns confirmed that cell
crowding-dependent expression of ABCA1 stems from the space available for a
single cell to adhere to, consistent with a cell-intrinsic mechanism of
adaptation (Extended Data Fig. 2f).

Figure 1 | Adaptation of the transcriptome to cellular crowding. a, Scatter
plot of the log2 ratio of transcript abundance in cells experiencing high
crowding (HC) over low crowding (LC) in mouse embryonic fibroblasts
(MEFs) expressing (FAK-WT, y axis) or lacking FAK (FAK-KO, x axis).
Significance threshold (straight lines): |log2(LC/HC)| > 1.5. b, Gene Ontology
enrichment network of genes suppressed by FAK in cells experiencing low
crowding. Node colour: enrichment, node size: number of genes, edge width:
number of overlapping genes between nodes. c, Branched DNA (bDNA)
single-molecule FISH against Abca1 transcripts in FAK-KO (representative of
1.2 × 10⁴ cells) or FAK-WT (representative of 1.5 × 10° cells) MEFs
experiencing low or high crowding. DAPI, 4′,6-diamidino-2-phenylindole.
Scale bars, 10 μm.

To understand if this cell-intrinsic mechanism can drive multicellular
pattern formation, we applied single-cell mathematical modelling
and computer simulation using a coupled two-level agent-based
modelling (ref. 15) and differential equation approach (Supplementary
Information (mathematical appendix)). The agent-based model
simulates the dynamic behaviour of focal adhesions (Supplementary
Video 1) and their adhesion potential in multiple single cells of a
growing cell population (Fig. 3a, Supplementary Video 2 and
Supplementary Information). Through indirect constraints that cells
impose on each other, emergent properties at both the single-cell and
the cell population level arise, including the formation of regions with
higher and lower local cell crowding and the emergence of cell
polarization and directed migration, agreeing with time-lapse measurements
of populations of proliferating cells (Extended Data Fig. 3a-d). In the
model, the adhesion potential of each simulated focal adhesion is then
used to promote the activation of FAK through an
autophosphorylation-based positive feedback loop (Extended Data Fig. 4a).
This predicts the appearance of a stable pattern of activated FAK in a
population of cells as observed in experiments (Fig. 3b, Supplementary
Video 3 and Extended Data Fig. 5a-d).

When modelling suppression of ABCA1 transcription downstream
of activated FAK, we discovered that a gradual pattern of ABCA1 in a
growing cell population only emerges when intracellular signal
processing is coupled to the timescale at which changes in cellular
crowding occur (Fig. 3b and Extended Data Figs 3e and 4b-d), and is
adapted by a feedback mechanism (Fig. 3b, Extended Data Figs 3e and 6a
and Supplementary Video 3). Timescale coupling could be achieved by the
property of the membrane to act as a store for the
phosphatidylinositol-3,4,5-triphosphate (PtdIns(3,4,5)P3 or PIP3) produced
by PI(3)K, while adaptation may be achieved by the ability of ABCA1 to
alter physical properties of the membrane (ref. 16), leading to a
decreased lipid ordering and increased diffusion rate of lipids (ref. 17),
which affects the probability of AKT activation on the membrane by
phosphoinositide-dependent kinase 1 (PDK1) (refs 18, 19). We thus
modelled the membrane as a ‘tunable capacitor’ (Fig. 3b and Extended
Data Fig. 3e) that stores PIP3 and that can be perturbed by ABCA1 in its
capacity to activate AKT. This generates a pattern of ABCA1 expression
similar to experimental observations that is insensitive to fluctuations
in most parameters and primarily depends on the strength of ABCA1
feedback (Extended Data Fig. 6b-d). It also recapitulates the dynamics
of ABCA1 downregulation in scratch assays, when cells at high local
crowding suddenly become exposed to free space (Extended Data Fig. 4f, g).
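
The 'tunable capacitor' topology described above can be illustrated with a deliberately simplified sketch. This is not the model from the Supplementary Information: the two equations, the linear dependence of active FAK on crowding, and every parameter value (`tau_pip3`, `tau_abca1`, `k_akt`, `feedback`) are assumptions chosen only to show how a slowly charging PIP3 store, combined with ABCA1 lowering the efficiency of AKT activation, yields low ABCA1 at low crowding and high ABCA1 at high crowding.

```python
def simulate(crowding, feedback=1.0, t_end=200.0, dt=0.01):
    """Integrate a toy capacitor model with forward Euler.

    crowding: assumed normalised local crowding in [0, 1].
    Returns (pip3, akt, abca1) at the end of the run (arbitrary units).
    """
    fak = 1.0 - crowding              # assumption: active FAK falls linearly with crowding
    tau_pip3, tau_abca1 = 20.0, 5.0   # assumption: relaxation times of the two pools
    k_akt = 0.2                       # assumption: AKT level giving half-maximal repression
    pip3, abca1 = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        # ABCA1 reduces how efficiently stored PIP3 is converted into active AKT.
        akt = pip3 / (1.0 + feedback * abca1)
        # Membrane 'capacitor': PIP3 is charged by FAK -> PI(3)K and leaks slowly.
        d_pip3 = (fak - pip3) / tau_pip3
        # Active AKT represses ABCA1 production (Hill-type repression, exponent 2).
        d_abca1 = (1.0 / (1.0 + (akt / k_akt) ** 2) - abca1) / tau_abca1
        pip3 += dt * d_pip3
        abca1 += dt * d_abca1
    akt = pip3 / (1.0 + feedback * abca1)
    return pip3, akt, abca1


if __name__ == "__main__":
    for c in (0.1, 0.5, 0.9):
        pip3, akt, abca1 = simulate(c)
        print(f"crowding={c:.1f}  PIP3={pip3:.2f}  AKT={akt:.2f}  ABCA1={abca1:.2f}")
```

Setting `feedback=0.0` removes the ABCA1 term from the AKT equation, which is one way to compare the behaviour with and without the feedback in this toy setting.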

Figure 3 | Multi-scale model of the FAK-ABCA1 system. a, Architecture of
agent-based modelled single cells encapsulating multiple agent-based modelled
focal adhesions. b, Model of FAK activation nested in each focal adhesion,
influenced by the adhesion potential of each focal adhesion emerging from
a (left, top part). Model-simulated pFAK levels in single cells (centre image,
green signal, representative of all simulations using the same parameters, this
run: 10° cells) and quantification (right, top graph) against local cell crowding
without (grey, Extended Data Fig. 4a) and with (red) positive feedback (FB),
experiments in black (Extended Data Fig. 1a). Control of ABCA1 transcription
by FAK using a tunable membrane capacitor topology, which involves PI(3)K
and AKT and feedback by ABCA1 (left, bottom part). Model-simulated
ABCA1 levels in single cells (centre, red signal), and quantification (right,
bottom graph) against local cell crowding without feedback (grey, Extended
Data Fig. 4b), with direct feedback (light blue, Extended Data Fig. 4c), and with
tunable capacitor (red). Experiments in black.

To investigate the existence of ABCA1 feedback on the capacitor
function of the membrane, we examined whether the naturally
observed crowding-dependent cell-to-cell variability in ABCA1
expression causes changes in membrane lipid composition. Cells
experiencing high crowding have a strikingly different lipid
composition than cells experiencing low crowding (Fig. 4a and Supplementary
Table 3). Cells experiencing low crowding which expressed ABCA1 at
levels naturally found in cells experiencing high crowding (Extended
Data Fig. 7a) have a lipid composition more closely resembling that of
cells experiencing high crowding (Fig. 4a and Extended Data Fig. 7b).
In particular, cells at low crowding have a higher amount of free
cholesterol, higher levels of cholesteryl esters (Fig. 4b and Extended
Data Fig. 7c), more lipid droplets (Extended Data Fig. 7f), a higher
ratio of glucosylceramide over ceramide (GlcCer/Cer) (indicative of
glycosphingolipid biosynthesis rate), higher levels of saturated lipids,
and lower levels of monounsaturated and polyunsaturated lipids than
cells at high crowding (Fig. 4b and Extended Data Fig. 7d, e). In cells
experiencing high crowding, plasmid-driven expression of ABCA1 did
not alter lipid composition (Fig. 4a, b, and Extended Data Fig. 7b-e).
As a consequence, cells experiencing high crowding display lower
membrane lipid ordering than cells experiencing low crowding
(Fig. 4c), mediated by the crowding-dependent expression of
ABCA1 (Extended Data Fig. 7g). Cells that lack FAK and thus express
high levels of ABCA1 contain less cholesterol and less of the
glycosphingolipid GM1 and display lower membrane lipid ordering than
cells expressing FAK (Extended Data Fig. 7h, i).

Figure 4 | The FAK-ABCA1 system adapts membrane lipid composition,
ordering and signalling to local crowding. a, Hierarchical clustering of
lipid profiles, see Extended Data Fig. 6b and Supplementary Table 3. P values
determined by t-test. b, Histograms of selected lipid species (for free
cholesterol in nmol per cell, see Extended Data Fig. 7c). For P values
(t-test), see Extended Data Fig. 7d (n = 4 biological replicates, each the
mean of 4 technical replicates, s.d.). c, Z-scored general polarization (GP)
values (see Extended Data Fig. 7g) per single A431 cells (left) stained with
Laurdan against local cell crowding (right) (interquartile area in grey,
number of single cells > 3 × 10°). d, The effect of levels of ABCA1-GFP,
randomly expressed from a plasmid in A431 cells at low crowding, on pAKT and
pPDK1 in single cells (interquartile area in grey). e, Untreated (top panels)
or glyburide-treated (bottom panels) A431 cells immunostained against pAKT
(T308). Nucleus segmentation images are colour-coded for pAKT levels. Top
curves (left): single-cell pAKT levels against local crowding in absence
(grey) or presence of glyburide (white) (n single cells > 10°). Bottom
curves: model-predicted pAKT levels against local crowding with (grey) or
without (white) feedback (interquartile areas in grey).

Similarly, we found that ABCA1 levels influence the amount of
S241-phosphorylated PDK1 and T308-phosphorylated AKT
(Fig. 4d). Accordingly, levels of T308-phosphorylated AKT are higher
in cells experiencing low crowding than in cells experiencing high
crowding (Fig. 4e). Pharmacological inhibition of ABCA1 abolished this
pattern, increasing the level of T308-phosphorylated AKT in cells
experiencing high crowding, as predicted by the model when the
double-negative feedback is removed (Fig. 4e). In addition, exogenous
loading of the membrane with cholesterol and the glycosphingolipid
GM1, as well as pharmacological inhibition of ABCA1, increases the
level of phosphorylated PDK1 and AKT in cells lacking FAK
(Extended Data Fig. 7j). Thus, ABCA1 inhibits the FAK-induced
signalling pathway that suppresses its own transcription by adapting
membrane lipid composition, confirming the membrane-based
feedback predicted by the model as a requirement for gradual patterning.
We made similar observations for levels of phosphorylated STAT3 and
PAK1/2, which are respectively an effector of cytokine receptors and of
the small GTPase RAC1, both sensitive to membrane lipid
composition (Extended Data Fig. 8) (refs 20, 21). This indicates that the
adaptation of membrane lipid composition to local crowding by the FAK-ABCA1
system influences multiple signalling pathways in cells, including those
involved in cell motility and paracrine signalling.

We have uncovered a cell-intrinsic molecular mechanism that 
allows patterning of membrane lipid composition and signalling 
according to local crowding in a cell population. Several genes with 
roles in membrane homeostasis may participate in this patterning 
system, including multiple ABC transporters and lipid-processing 
enzymes (see Supplementary Table 1, Extended Data Fig. 9 and 
Supplementary Discussion). In our minimal model, pattern formation 
of membrane lipid composition only requires variation in the extent of 
cellular crowding to emerge as cells proliferate. Patterning is subse- 
quently promoted and stabilized by feedback loops without the 
need for specific cell-cell communication. Because lipid composition
affects many membrane protein activities, adapting it to local
crowding may have a fundamental role in controlling cellular
behaviour within a social context, from colony formation in unicellular
organisms (ref. 22) to collective cell migration (ref. 23), haematopoiesis
(ref. 24) and T cell activation (ref. 25), and the control of epithelial
cell proliferation in multicellular organisms (ref. 26).

Our work indicates a crucial role for membrane-based signalling in 
this cell-intrinsic system, in which the membrane may act as a capa- 
citor that converts signals to the correct timescale and is tuned by 
enzymes that alter membrane lipid composition and ordering in a 
feedback mechanism. Both timescale adaptation and feedback are 
required for gradual patterns in a growing cell population to emerge. 
It will now be important to unravel how such a tunable capacitor 
operates mechanistically, and to generalize this concept to the possible 
uses of cellular structures in signal computation. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.
METHODS

Cell culture. Media and reagents were from GibcoBRL. Wild-type MEFs
(FAK-WT), MEFs knocked out for FAK (FAK-KO) and A431 cells were purchased
from ATCC. Mouse embryonic fibroblasts rescued for FAK (FAK-rescue) were a
gift from C. Hauck (University of Konstanz, Germany). E. Reichmann and L.
Pontiggia provided keratinocyte primary cells (UZH, Zurich). Standard growth
conditions were as follows: cells were incubated for 3 to 4 days in DMEM
containing 10% FBS and 1× glutamine (+135 μg ml⁻¹ hygromycin B for the
FAK-rescue cells) at 37°C under 5% CO2. Initial cell number was 2 × 10° to
2.5 × 10° cells for 10-cm dishes, 3 × 10‘ to 5 × 10° cells per well for
12-well plates containing 13 mm coverslips and 2 × 10° to 2.5 × 10° cells
per well for 96-well plates. All our cell lines are tested on a monthly
basis for mycoplasma contamination using a chemiluminescent assay. The
service is independent, centralized for all of UZH and provided at the
Institute of Virology of UZH. Once the desired population pattern is reached
(see video in ref. 3, Snijder et al. 2009) cells are serum deprived for
approximately 12 h and used for subsequent preparations.
Wortmannin (100 nM), Y15 (25 μM), LY-294002 (10 μM) and glyburide (25
μM) treatments were performed over approximately 12 h before preparation.
Coverslips were mounted on glass slides using Immu-Mount (Thermo
Scientific), a water-based mounting medium.

Plasmid transfection. FAK-WT cells grown in 96-well plates or 10-cm dishes
were transfected with 80 ng per well or 4 μg of ABCA1 construct,
respectively, carried in the pEGFP-N1 backbone and mixed with 0.2 or 10 μl
Lipofectamine 2000 following the manufacturer’s specifications. The Homo
sapiens ABCA1 coding sequence was synthesized de novo and inserted between
the SacI and SacII restriction sites. The cloned ABCA1 sequence corresponds
to the full-length consensus coding sequence CCDS6762.1.

Cholesterol and GM1 staining. Cells were quickly washed successively with
1× PBS, 5% delipidated BSA and 1× PBS, and fixed for 4 min with 4% PFA.
Cholesterol was stained using 0.01 mg ml⁻¹ filipin (Sigma) for 20 min. After
two washes of 5 min in PBS, surface GM1 was stained using 0.2 μg ml⁻¹ cholera
toxin subunit B (Alexa Fluor 555 conjugate, Invitrogen) for 10 min.

Laurdan live staining. Cells were grown in ibidi μ-Slide 8 well chambers under
standard conditions. Five minutes before acquisition, cells were mounted on the
microscope (see microscope section) with environmental control and live stained
by addition of 6-dodecanoyl-2-dimethylaminonaphthalene (Laurdan, Molecular
Probes) and Draq5 (Cell Signaling) at 5 and 0.5 μM final concentrations,
respectively, directly in the medium. Images were acquired within the next 2 min.

Immunostaining. Unless specified, cells were grown following standard
procedures. Fixation was performed with 4% PFA for 10 min, permeabilization
with 0.1% Triton X-100 for 10 min, and blocking with 1% BSA, 50 mM NH4Cl for
30 min. Primary and secondary antibodies were diluted in blocking solution;
treatments were separated by two 30-min PBS washes. Secondary antibody was
applied for 1 h (Alexa Fluor 488 or 568 goat anti-rabbit antibody,
Invitrogen, 1 μg ml⁻¹). Nuclear staining was performed with 1 μM DAPI for
10 min and cell outlines were visualized with Alexa Fluor 647 carboxylic
acid succinimidyl ester (Life Science, 10⁻⁴ dilution) staining for 10 min.
For the pFAK staining, primary antibody was applied for 3 h
(rabbit anti-pFAK (Y397) antibody, Cell Signaling no. 3283, 1:200) as well as for
ABCA1 (rabbit anti-ABCA1 antibody, Abcam ab7360, 1:500). For pAKT (rabbit
anti-pAKT (T308) antibody, Cell Signaling no. 2965, 1:1,000), pPDK1 (rabbit
anti-pPDK1 (S241), no. 3061, Cell Signaling, 1:1,000), pSTAT3 (rabbit anti-pSTAT3
(T705) antibody, Cell Signaling no. 9131, 1:500) and pPAK1 (rabbit anti-pPAK1/2
(T423/T402) antibody, Cell Signaling no. 2601, 1:200) staining, primary antibody
was applied overnight at 4°C.

mRNA bDNA-FISH experiments. FAK-WT cells were grown following standard
conditions in 96-well plates. Abca1 mRNA bDNA-FISH experiments and
image-based analysis were performed using the protocol and computational
method published by our laboratory (ref. 11). Briefly, cells were fixed,
permeabilized and treated with protease K so that the Abca1 mRNA-specific
probe set could properly access its target sequences. A three-step treatment
with successive hybridization of pre-amplifier, amplifier and fluorescent
probes amplifies the mRNA probe signal and allows the visualization of
single Abca1 mRNAs. Nuclear staining was performed with 1 μM DAPI for
10 min. Cell outlines were visualized with Alexa Fluor 647 carboxylic acid
succinimidyl ester (Life Science) (10⁻⁴ dilution) staining for 10 min.

Microscopes. Laurdan, filipin and cholera toxin B images were acquired with 40×
magnification on a Leica SP5 confocal microscope equipped with a UV laser
(λ, 355 nm) in addition to the usual set of visible light lasers, for proper
stimulation of Laurdan and filipin. Confocal images of pFAK were acquired on a
Zeiss LSM710 microscope with 40× magnification (Zeiss NA1.2, C-apochromat,
Korr UV-VIS-IR). GFP-FAK total internal reflection fluorescence (TIRF) video
images were acquired on a Nikon visiView microscope with 100× magnification.
Immunostainings of ABCA1, pS6, pAKT, pPI(3)K, pSTAT3, pPAK1 and mRNA
bDNA-FISH images were acquired on an automated Yokogawa CV7000 spinning
disk microscope.

Image analysis. All image analysis was performed using CellProfiler (ref. 27)
following the same procedure we used in previous publications (refs 3, 11,
28), with the help of additional MATLAB scripts published previously for the
calculation of cellular crowding (ref. 3) or written specifically for this
study for Laurdan image analysis (see specific section). The general image
analysis pipeline was as follows. First, nuclei were detected and segmented
based on the DAPI or Draq5 stain using the IdentifyPrimaryObjects
CellProfiler module. Then, cell boundaries were estimated using nuclear
propagation in the IdentifySecondaryObjects CellProfiler module. Standard
CellProfiler texture, intensity, size and shape features were extracted
from nucleus and cell regions. We additionally implemented several image
analysis steps for the detection of out-of-focus images and for the Support
Vector Machine (SVM)-based classification (ref. 29) of poorly segmented nuclei.

Membrane ordering analysis. A dedicated CellProfiler module was
developed for this study (the code is available upon request) to define
single-cell generalized polarization (scGP) values automatically after
nuclear and cell segmentation. This measurement is based on a previous
publication (ref. 30) and works as follows: images of cells stained with
Laurdan (see specific section above for details) are simultaneously acquired
in the 400-460 nm (I1) and 470-530 nm (I2) wavelength windows after
stimulation at 355 nm. The GP value is defined for each pixel (pxGP)
following the formula:

$$\mathrm{GP} = \frac{I_1 - I_2}{I_1 + I_2}$$

The mean GP value of each single cell (scGP value) is then defined by the mean of
all pxGP values contained in each segmented cell.
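
The per-pixel and per-cell GP calculation described above can be sketched as follows. This is not the authors' CellProfiler module (which is available only on request); the function names, the nested-list image representation, the label convention (0 for background) and the choice to skip pixels with zero total intensity are all assumptions made for illustration.

```python
def pixel_gp(i1, i2):
    """Generalized polarization of one pixel: (I1 - I2) / (I1 + I2).

    Returns None when both channels are zero (assumption: such pixels are skipped).
    """
    total = i1 + i2
    if total == 0:
        return None
    return (i1 - i2) / total


def single_cell_gp(i1_image, i2_image, labels):
    """Mean pxGP per segmented cell.

    i1_image, i2_image: 2D lists of intensities from the 400-460 nm and
    470-530 nm windows. labels: 2D list of integer cell labels from the
    segmentation step, with 0 meaning background (assumed convention).
    """
    sums, counts = {}, {}
    for row1, row2, row_labels in zip(i1_image, i2_image, labels):
        for a, b, cell in zip(row1, row2, row_labels):
            if cell == 0:
                continue
            gp = pixel_gp(a, b)
            if gp is None:
                continue
            sums[cell] = sums.get(cell, 0.0) + gp
            counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


if __name__ == "__main__":
    i1 = [[3, 3], [1, 0]]
    i2 = [[1, 1], [3, 0]]
    labels = [[1, 1], [2, 2]]
    print(single_cell_gp(i1, i2, labels))
```

Higher scGP values correspond to a stronger signal in the shorter-wavelength window relative to the longer one.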

Microarray analysis. High and low crowding FAK-rescue and FAK-KO cells were
grown for 24 h in 10-cm dishes, in 10 ml of standard medium (described in the
cell culture and preparation section). High crowding cells were seeded at a
concentration of 10° cells per ml and low crowding cells at 0.4 × 10° cells
per ml. RNA preparations were done with the Qiagen RNeasy Mini Kit according
to the manufacturer’s manual, including the optional column DNase treatment.

The quality of the isolated RNA was determined with a NanoDrop ND 1000
(NanoDrop Technologies, Delaware, USA) and a Bioanalyzer 2100 (Agilent,
Waldbronn, Germany). Only the samples with a 260/280 nm ratio between 1.8
and 2.1 and an RNA integrity number (RIN) higher than 8 were further processed.
Total RNA samples (100 ng) were reverse-transcribed into double-stranded
cDNA in presence of RNA poly-A controls, RNA Spike-In Kit, One-Colour
(Agilent product number 5188-5282). The double-stranded cDNAs were in vitro
transcribed in presence of Cy3-labelled nucleotides using a Low Input Quick Amp
Labelling Kit, one-colour (Agilent product number 5190-2305). The Cy3-labelled
cRNA was purified using an RNeasy mini kit, Qiagen (product number 74104 or
74106) and its quality and quantity were determined using NanoDrop ND 1000 and
Bioanalyzer 2100. Only cRNA samples with a total cRNA yield higher than
2 μg and a dye incorporation rate between 8 pmol μg⁻¹ and 20 pmol μg⁻¹ were
considered for hybridization.
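
The two quality gates stated above can be written down as simple predicates. The thresholds are those given in the text; the function names are hypothetical, and treating the "between" bounds as inclusive is an assumption, since the text does not say whether the end points are accepted.

```python
def rna_passes_qc(ratio_260_280, rin):
    """Total RNA gate: 260/280 ratio between 1.8 and 2.1 and RIN higher than 8."""
    # Assumption: the 1.8 and 2.1 bounds are inclusive.
    return 1.8 <= ratio_260_280 <= 2.1 and rin > 8


def labelled_sample_passes_qc(yield_ug, dye_pmol_per_ug):
    """Labelled sample gate: yield above 2 ug, dye incorporation of 8-20 pmol per ug."""
    # Assumption: the 8 and 20 pmol per ug bounds are inclusive.
    return yield_ug > 2 and 8 <= dye_pmol_per_ug <= 20


if __name__ == "__main__":
    print(rna_passes_qc(1.95, 9.2))             # within both limits
    print(rna_passes_qc(1.70, 9.2))             # ratio too low
    print(labelled_sample_passes_qc(3.1, 12.0))
    print(labelled_sample_passes_qc(1.8, 12.0)) # yield too low
```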

Cy3-labelled cRNA samples (1.65 μg) were mixed with an Agilent Blocking
Solution, subsequently randomly fragmented to 100-200 bp at 65°C with
Fragmentation Buffer, and resuspended in Hybridization Buffer using a Gene
Expression Hybridization Kit (Agilent product number 5188-5242). Target
cRNA samples (100 μl) were hybridized to Whole Mouse Genome 4×44K
OligoMicroarrays (Agilent G4122F) for 17 h at 65°C. Arrays were then washed
using Agilent GE Wash Buffers 1 and 2 (Agilent product number 5188-5326),
according to the manufacturer’s instructions (One-Colour Microarray-Based
Gene Expression Analysis Manual, http://www.agilent.com). An Agilent
Microarray Scanner (Agilent product number G2565BA) was used to measure
the fluorescent intensity emitted by the labelled target. The microarray data set has
been uploaded to the NCBI Gene Expression Omnibus as record GSE43873;
reorganized and filtered data can be downloaded in the Supplementary
Information section (MicroarrayData.xls).

Functional enrichment analysis. The Gene Ontology term enrichment analysis
was done with DAVID (refs 31, 32) on genes significantly more expressed
(absolute log2(low/high crowding) gene expression value over 1.5) in
FAK-expressing cells. Functional groups shown in the two networks have an
enrichment value greater than 2 and are composed of at least 5 genes.
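
The gene selection step that precedes the enrichment analysis amounts to a threshold on the absolute log2 ratio. A minimal sketch follows; the gene names and expression values are invented for illustration, the function name is hypothetical, and the enrichment analysis itself (done with DAVID) is not reproduced.

```python
import math


def crowding_regulated_genes(expression, threshold=1.5):
    """Return {gene: log2(low/high)} for genes with |log2(low/high)| > threshold.

    expression: dict mapping gene name to (low_crowding, high_crowding)
    expression values, both assumed to be strictly positive.
    """
    selected = {}
    for gene, (low, high) in expression.items():
        ratio = math.log2(low / high)
        if abs(ratio) > threshold:
            selected[gene] = ratio
    return selected


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = {
        "geneA": (800.0, 100.0),  # log2 ratio = 3
        "geneB": (120.0, 100.0),  # below threshold
        "geneC": (100.0, 400.0),  # log2 ratio = -2
    }
    print(crowding_regulated_genes(example))
```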

Selection of candidate transcription factors. The 19 transcription factors
screened in the FAK-KO cells for their potential effect on Abca1 mRNA expression
were selected using a combination of three approaches: (1) candidates having a
binding site in all of the top 10 FAK-suppressed genes defined with the
microarray data (to perform this comparison, we used the Pscan algorithm
(http://www.beaconlab.it/pscan) with the JASPAR database (ref. 33)
(http://jaspar.genereg.net/)); (2) transcription factors having the
strongest GO enrichment for lipid homeostasis; or (3) transcription factors
having a reported ChIP binding site or an effect on expression
for ABCA1 in the literature (Supplementary Table 2).

siRNA experiments. All siRNAs were purchased from Qiagen. FAK-KO cells
were cultured in 24-well plates, using standard conditions until reaching
approximately 60% confluency (48-60 h) and transfected by forward
transfection. Per well, 25 pmol of siRNA was mixed in 25 μl of Opti-MEM, and
0.5 μl of Lipofectamine RNAiMAX was mixed with 24.5 μl of Opti-MEM. After
5 min of incubation, the solutions were mixed together, incubated for another
20 min at room temperature and transferred onto the cultured cells for 60 h
before RNA preparation.

qPCR screening. Silenced FAK-KO cells were washed with 1× PBS. RNA samples
were prepared using the NucleoSpin RNA II kit (Macherey Nagel) and cDNA
synthesis was carried out with the Transcriptor High Fidelity cDNA Synthesis
Kit (Roche) using poly-dT primers, in both cases following the
manufacturer’s protocol. Quantitative real-time PCR was performed in 384-well
plates in an AB7900HT qPCR device (Applied Biosystems) using the following
primers: forward ABCA1, 5′-CTGTAGACCTGGAGAGAAGCTTTC-3′; reverse ABCA1,
5′-CAGCTCCATGGACTTGTTGATGAG-3′, allowing amplification over the twelfth and
thirteenth exons contained in all ABCA1 mRNA variants; forward GAPDH,
5′-TCAAGGCTGAGAACGGGAAGCTTG-3′; and reverse GAPDH,
5′-AGCCTTCTCCATGGTGGTGAAGAC-3′. Relative mRNA amounts were calculated using
GAPDH as an internal reference.
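
The text does not state which formula was used to compute relative mRNA amounts. One common choice when a single internal reference gene is used is the comparative Ct method, sketched below under the assumptions that amplification efficiency is 100% for both primer pairs and that values are expressed relative to a control sample. The function names and the Ct values in the example are hypothetical.

```python
def relative_amount(ct_target, ct_reference):
    """Target abundance relative to the reference gene: 2^-(Ct_target - Ct_reference).

    Assumes perfect doubling per cycle for both amplicons.
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Comparative Ct (2^-ddCt): sample expression relative to a control sample."""
    sample = relative_amount(ct_target_sample, ct_ref_sample)
    control = relative_amount(ct_target_control, ct_ref_control)
    return sample / control


if __name__ == "__main__":
    # Hypothetical Ct values: the target in the silenced sample crosses the
    # threshold one cycle later than in the control, at equal reference Ct.
    print(fold_change(24.0, 18.0, 23.0, 18.0))
```

With these example values the target crosses threshold one cycle later in the silenced sample, which corresponds to half the control abundance under the stated assumptions.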

Western blotting. A431, FAK-WT and FAK-KO cells were cultured using
standard conditions in 10-cm dishes. Low crowding cells were stopped after
2 to 2.5 days of growth, whereas high crowding cells were grown for 6 days
(both including a final 12 h of serum starvation). Cells were washed with
1× PBS and disrupted in lysis buffer (0.5% sodium deoxycholate, 150 mM NaCl,
50 mM Tris-HCl, pH 7.2, 0.1% SDS, 1% Triton X-100, 0.2% NaN3), and 15 μg of
each protein extract was separated using 10% PAGE except for ABCA1 western
blotting, where 50 μg of protein and 8% PAGE were used. Separated proteins
were then transferred onto a membrane (Immobilon-P, 0.45 μm, Millipore)
using the humid chamber method. Transfer conditions were 80 mA overnight for
ABCA1 western blotting and 250 mA for 90 min otherwise. Membranes were
blocked with 4% BSA in 1× TBS-T (1× TBS, 0.1% Tween) for 1 h. Primary
antibodies rabbit anti-pFAK (Cell Signaling no. 3283), rabbit anti-pPI(3)K
p85/p55 (T458/T199; Cell Signaling no. 4228) and rabbit anti-pAKT (T308;
Cell Signaling no. 2965) were diluted at 1:1,000 and rabbit anti-actin
(Cell Signaling no. 8456) at 1:5,000. Rabbit anti-TAL1 (Sc-12984, Santa
Cruz) and rabbit anti-pFOXO3 (S253, no. 9466, Cell Signaling) were diluted
at 1:200 and rabbit anti-ABCA1 (Abcam ab7360) at 1:500 in blocking buffer.
HRP-conjugated secondary anti-mouse (no. 170-6516, BioRad) and anti-rabbit
(no. 170-6515, BioRad) antibodies were diluted at 1:5,000 in the same
buffer. Primary and secondary antibodies were applied overnight at 4°C and
for 60 min at room temperature, respectively. Signal was revealed with HRP
substrate solution and imaged with a CCD camera (for antibody references see
immunostaining section).

ChIP experiments. FAK-KO and FAK-WT cells were cultured using standard
conditions in 10-cm dishes. Low crowding cells were stopped after 2 to 2.5 days of
growth, whereas high crowding cells were grown for 6 days (both including a final
12 h of serum starvation). Experiments were carried out using the Chromatin
Immunoprecipitation (ChIP) Assay Kit from Millipore following the manufacturer’s
specifications except for the following changes. Fixation of cells was performed
with 1.6 mM dithiobis(succinimidyl propionate) (DSP) for 20 min, followed by
two short washes with 1× PBS at room temperature, and finally 1%
paraformaldehyde for 20 min. 20 μg of anti-TAL1 (Sc-12984, Santa Cruz),
anti-FOXO3 (07-702, Millipore) or anti-LXR beta (Sc-34341, Santa Cruz)
primary antibody was added for 15 h at 4°C to the pre-cleared supernatant.
Protein A beads were then added for 4 h. Reversion of crosslinking was done
for 12 h at 55°C.

Lipid mass spectrometry 

Chemicals and lipid standards. DLPC 12:0/12:0 (850335), PE 17:0/14:1 (PE31:1, 
LM-1104), PI 17:0/14:1 (PI31:1, LM-1504), PS 17:0/14:1 (PS31:1, LM-1304), C17:0
ceramide (860517), C12:0 SM (860583) and Glucosyl C8:0 Cer (860540) were used 
as internal lipid standards and were purchased from Avanti Polar Lipids Inc. 
(Alabaster, AL). Ergosterol was used as sterol standard and was purchased from 
Fluka (Buchs, Switzerland). Methyl tert-butyl ether (MTBE) was from Fluka 
(Buchs). Methyl amine (33% in absolute ethanol) was from Sigma Aldrich 
(Steinheim, Germany). HPLC-grade chloroform was purchased from Acros 
(Geel, Belgium), liquid chromatography-mass spectrometry (LC-MS) grade 
methanol and LC-MS grade ammonium acetate were from Fluka. LC-MS grade 
water was purchased from Biosolve. 

Cell culture. FAK-WT cells were cultured using standard conditions in 10-cm
dishes. Low crowding cells were stopped after 2.5-3 days of growth while high
crowding cells were grown for 6 days (both including a final 12 h of serum
starvation). Cells were transfected with a human ABCA1-containing plasmid as
described above or subjected to the transfection procedure without plasmid after
one day of culture for low crowding cells or four days of culture for high crowding
cells. Cells facing low or high crowding were collected two days after transfection.
Cells were briefly washed successively with 1× PBS, 5% delipidated BSA, and
three times with cold 1× PBS, then scraped and pelleted at 800g for 5 min
before lipid extraction.

Lipid analysis. Lipid extracts of 4 biological replicates of each of the 4 conditions
(high crowding; high crowding + ABCA1; low crowding; low crowding + ABCA1)
were prepared using the MTBE protocol (ref. 34) and measurements were made in
4 technical replicates, amounting to a total of 64 measurements at each mass
spectrometer. Cell pellets were resuspended in 100 μl of water and transferred
into a 2 ml Eppendorf tube. Then 360 μl methanol and a mix of internal
standards were added (400 pmol DLPC, 1,000 pmol PE31:1, 1,000 pmol
PI31:1, 3,300 pmol PS31:1, 2,500 pmol C12SM, 500 pmol C17Cer and 100 pmol
C8GC). Samples were vortexed and 1.2 ml of MTBE was added. Samples were
placed for 10 min on a multitube vortexer at 4°C (Lab-tek International)
followed by an incubation for 1 h at room temperature on a shaker. Phase
separation was induced by addition of 200 μl MS-grade water. After 10 min of
incubation at room temperature, samples were centrifuged at 1,000g for
10 min. The upper (organic) phase was transferred into a 13 mm glass tube
with a Teflon-lined cap and the lower phase was re-extracted with 400 μl
artificial upper phase (MTBE/methanol/H2O 10:3:1.5). In total, 1,500 μl of
organic phase was recovered from each sample, split into three parts and dried
in a CentriVap Vacuum Concentrator (Labconco). One part was treated by
alkaline hydrolysis to enrich for sphingolipids and the other two aliquots were
used for glycerophospholipid/phosphorus assay and sterol analysis,
respectively. Glycerophospholipids were deacylated according to the method by
Clarke & Dawson (ref. 35). Briefly, 1 ml freshly prepared monomethylamine
reagent (methylamine/H2O/n-butanol/methanol at 5:3:1:4 (vol/vol)) was added
to the dried lipid extract and then incubated at 53°C for 1 h in a water bath.
Lipids were cooled to room temperature and then dried. For desalting, the dried
lipid extract was resuspended in 300 μl water-saturated n-butanol and then
extracted with 150 μl H2O. The organic phase was collected, and the aqueous
phase was re-extracted twice with 300 μl water-saturated n-butanol. The
organic phases were pooled and dried in a CentriVap Vacuum Concentrator.

Sterols analysis by gas chromatography-mass spectrometry (GC-MS).
One-third of total lipid extract was resuspended in 500 μl of MS-grade
chloroform/methanol (1:1) solution and injected into a VARIAN CP-3800 gas
chromatograph equipped with a Factor Four Capillary Column VF-5ms 15 mm ×
0.32 mm i.d. DF = 100. Identification and quantification of sterol species
were performed using a VARIAN 320MS as described in ref. 36.

Phospholipids and sphingolipids analysis by electrospray ionization mass 
spectrometry (ESI-MS). Identification and quantification of phospholipid and 
sphingolipid molecular species were performed using multiple reaction monitor- 
ing with a TSQ Vantage Triple Stage Quadrupole Mass Spectrometer (Thermo 
Scientific) equipped with a robotic nanoflow ion source, Nanomate HD (Advion 
Biosciences). Each individual ion dissociation pathway was optimized with regard 
to collision energy. Lipid concentrations were calculated relative to the relevant 
internal standards as described in ref. 37 and then normalized to the total phos- 
phorus content of each total lipid extract to adjust for difference in cell size, 
membrane content, and extraction efficiency. 
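
The normalization described above can be illustrated with a minimal Python sketch. It assumes a single-point calibration against a spiked internal standard, followed by division by the total phosphorus of the same extract. The function names, the calibration scheme and all numbers are hypothetical and chosen only for illustration; the calculation actually used is the one described in ref. 37.

```python
def lipid_amount_pmol(analyte_intensity, standard_intensity, standard_pmol):
    """Estimate a lipid amount from its signal relative to an internal standard.

    Assumes a single-point calibration (an illustrative simplification).
    """
    if standard_intensity <= 0:
        raise ValueError("internal standard signal must be positive")
    return analyte_intensity / standard_intensity * standard_pmol


def normalize_to_phosphorus(lipid_pmol, total_phosphorus_nmol):
    """Express a lipid amount per nmol of total phosphorus in the same extract."""
    if total_phosphorus_nmol <= 0:
        raise ValueError("total phosphorus must be positive")
    return lipid_pmol / total_phosphorus_nmol


# Hypothetical values, not data from the study.
amount = lipid_amount_pmol(analyte_intensity=4.0e5,
                           standard_intensity=2.0e5,
                           standard_pmol=50.0)
normalized = normalize_to_phosphorus(amount, total_phosphorus_nmol=20.0)
print(amount, normalized)
```

Dividing by total phosphorus makes samples comparable when cell size, membrane content or extraction efficiency differ between them.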

Determination of total phosphorus content. The dried total lipid extract was 
resuspended in 250 µl chloroform/methanol (1:1) and 50 µl were placed into a 13
mm disposable pyrex tube. The solvent was completely evaporated and 0, 2, 5, 10,
20 µl of a 3 mM KH2PO4 standard solution were placed into separate pyrex tubes.
To each tube 20 µl of water and 140 µl of 70% perchloric acid were added. Samples
were heated at 180°C for 1 h in a hood. Tubes were then removed from the block 
Extended Data Figure 1 | Adaptation of the transcriptome to cellular
crowding. Related to Fig. 1. a, Immunofluorescence against phosphorylated 
FAK (Y397) in a population of A431 cells, corresponding curve shows single- 
cell phosphorylated FAK signals against local cell crowding (interquartile area 
is shown in grey, number of cells >10*). b, Gene Ontology enrichment network 
of genes that are induced by FAK in cells experiencing low crowding. Greyscale 
indicates enrichment, node-size number of genes, edge width between nodes 
number of overlapping genes. c, Histogram of ABC transporters more 
expressed in cells lacking FAK compared to cells expressing FAK when facing 
low crowding. d, Single-cell transcript counts of Abca1 in 1.2 × 10* FAK-KO
and 1.5 × 10’ FAK-WT cells experiencing increasing levels of local crowding
(interquartile area in grey). e, Control experiment of bDNA single-molecule
FISH against bacterial dapB transcripts in FAK-KO or FAK-WT cells
experiencing low crowding or high crowding. Representative of 10° cells.

f, Real-time PCR measurements of Abca1 transcripts in cells at low and high 
local crowding in both FAK-expressing and FAK-KO cells in the presence of 
10% FCS. Clearly, Abca1 mRNA levels are much higher in FAK-expressing cells
facing high crowding than in the same cells facing low crowding (s.d., n = 4
biological replicates each made of 3 technical replicates, P< 10 1°, t-test) but
also in FAK-KO cells compared with FAK-expressing cells (s.d., n = 4 biological
replicates each made of 3 technical replicates, P< 10 10, t-test). This indicates
that FAK-dependent adaptation of Abca1 transcription to cell crowding also 
operates in the presence of an abundant and homogeneous amount of growth 
factors and cytokines in the medium. 
Extended Data Figure 2 | FAK suppresses ABCA1 expression in cells at low
crowding via TAL1 and FOXO3 in a cell-intrinsic way. Related to Fig. 2.
a, Percentage reduction of Abca1 mRNA in FAK-KO cells upon silencing of 19
potential transcription factors. b, Table of primers used for qRT-PCR
amplification of Abca1 DNA and corresponding genomic position. c, Western
blots of pFAK, pPI(3)K and pAKT levels in FAK-WT and FAK-KO MEFs, and
A431 cells at low crowding, high crowding or low crowding + wortmannin.
d, Real-time PCR quantification of Abca1 mRNA shows that treatment with
LY-294002 alleviates the inhibitory effect of FAK on Abca1 transcription in
cells (at low crowding) expressing FAK (s.d., n = 4 biological replicates each
made of 3 technical replicates, P< 10°, t-test), whereas this treatment has no
significant effect on Abca1 transcription in cells that lack FAK (s.d., n = 4
biological replicates each made of 3 technical replicates, P > 0.1, t-test).
e, Immunofluorescence imaging of ABCA1 over a population of A431 cells
in the presence of Y15 FAK inhibitor and related projection of single cell
measurements onto nuclear segmentations. f, Quantifications of Abca1 protein
expression in FAK-WT cells adhering to micropatterned surfaces of large
(10,000 µm²) or small (2,000 µm²) area (http://www.cytoo.com) at long
distance from potentially secreting neighbouring cells. This shows that
space constraints are sufficient to trigger differences in Abca1 expression
(s.d., n = 100 cells, P < 10⁻⁴, t-test).
Extended Data Figure 3 | Agent-based modelled single cells show
characteristics similar to tracked cells. a, Typical curve of the growth of the 
nucleus size of a single cell between two mitotic events (centre). Distribution of 
measured (number of tracks: 650) and agent-based modelled (number of 
tracks: 200) single-cell nucleus sizes (right histograms) and cell-cycle lengths 
(bottom histograms). Black, raw data, red, fitted Gaussian curve. Agent-based 
modelled cells and measured cells show similar distributions in cell-cycle length 
and nucleus size. b, Curve showing single-cell mean nuclear area against local 
cell crowding of measured (black, number of cells: >10*) and agent-based
modelled cells (red, number of cells: >10°). c, Histograms of single-cell area
distribution of measured (number of cells: >10*) and agent-based modelled
cells (number of cells: >10°) showing that the distributions of emerging cell
areas of modelled cells match those of measured cells even for extreme values.
d, Histograms of single-cell mean square displacement distribution of 
measured (number of tracks: 650) and agent-based modelled cells (number of 
tracks: 200). e, Timescales of information sensing and processing steps in the 
FAK-ABCA1 system. Absence of a capacitor does not allow gradual patterns to
emerge (switch-like behaviour). 
Extended Data Figure 4 | Alternative models do not lead to the emergence
of gradual patterns in ABCA1 expression, and the full model recapitulates
experimentally observed dynamics of reduction in ABCA1 expression in
scratch assays. Conclusions are parameter-independent, for details see
mathematical appendix in the Supplementary Information. a, A FAK activation
model without autophosphorylation does not result in a pFAK pattern in an
agent-based modelled cell population. b, A FAK-ABCA1 model based on free
diffusion of signalling molecules without or with c, addition of a putative direct
inhibitory effect of ABCA1 on its own suppression does not result in a
patterning of ABCA1 expression. d, Introduction of a membrane relay for AKT
activation without ABCA1 feedback on the membrane relay does not result in a
patterning of ABCA1 expression. e, Simulated single-cell ABCA1 variability
over local crowding is similar to the variability seen in our experiments
(see Fig. 2d). f, Scratch assays, in which cells at high crowding suddenly become
exposed to free space to spread and are followed over time, show that reduction of
ABCA1 levels in these cells has a half-maximum effect at ~50 min, and full
effect at ~200 min. g, This is in agreement with simulations of scratch assays
using our cell-intrinsic agent-based model of the FAK-ABCA1 system. The
process was iterated thousands of times with random starting levels of
ABCA1 similar to the variability seen in the experimental scratch assay.
20 representative curves are shown. In the simulations, it takes ~150 min
for the disappearance of half of ABCA1. h, Distributions of pixel GP values of
FAK-KO cells stained with Laurdan at different time-points after treatment
with glyburide. After just 20 min of drug treatment, the membranes of these
cells become more ordered (P < 10° !°°, t-test, pixel distributions at each time
point are made from 2 × 10° cells).
no effect on the formation of a pFAK pattern even if ka is bigger than k, by
where ABCA1 is either expressed or not in all cells of the population,
independent of local cell crowding. Inhibition power represents the ABCA1
competitive inhibitory power. c, Heat map representing Euclidean distance
between modelled and measured levels of ABCA1 in single cells as a function of
local crowding when Trsh1 and Trsh2 vary in model C. d, The capacity of
model C to generate a gradual expression pattern (low Euclidean distance is
black) does not depend on k3 and 3’, and HL1 and 1’, demonstrating the central 
role of the membrane relay for gradual patterns to emerge. 
Extended Data Figure 7 | The FAK-ABCA1 system adapts membrane lipid
composition, ordering and signalling to local crowding. Related to Fig. 4.
a, Histogram of transcript copy number (number of spots) per cell determined
with bDNA single-molecule FISH against endogenous Abca1 in cells at high
crowding, or against ABCA1-GFP transcripts in cells at low crowding
transfected with the pEGFP-N1-ABCA1 construct. This shows that plasmid-
driven ABCA1-GFP expression in cells at low crowding does not exceed that of
endogenous Abca1 levels in cells at high crowding. b, Hierarchical clustering of
lipid profiles of mouse embryonic fibroblasts grown at high crowding or low
crowding conditions and transiently expressing ABCA1 from a plasmid
(+ABCA1) or not. The clustergram shows the 48 lipid species that represent
80% of the total lipid amount. Colours correspond to pmol/pmol total lipid
z-scored over the four conditions, colours of lipid names refer to their clusters.
For complete lipid mass spectrometry data, see Supplementary Table 3.
c, Histograms displaying the quantity of free cholesterol in nmol per cell (n = 4
biological replicates, each the mean of 4 technical replicates, s.d.). d, P values
related to the bar graphs in Fig. 4c. e, Pie charts representing the percentage of
saturated, monounsaturated and polyunsaturated lipids for the four different
conditions. f, Fluorescence imaging using Bodipy 493/503 dye of lipid
droplets in low crowding (n = 5 × 10° single cells) or high crowding conditions
(n = 5 × 10° single cells). This confirms that cells at low crowding contain a
larger amount of cholesteryl-esters, which are stored in lipid droplets.
g, Diagram summarizing the method to measure membrane ordering of a
formaldehyde fixed population of cells at the single-cell level (left flow chart).
Distributions of single-cell GP values for groups of cells that are the top 20, 100,
200, 300 ABCA1-GFP expressing cells compared to all cells (top right
distributions, n = 500 cells) and curve showing the relationship between single-
cell ABCA1 expression and scGP value (bottom right curve, n = 500 cells).
h, Image-based quantification of free cholesterol (filipin), GM1 content
(cholera toxin B binding or anti-GM1 antibody) and lipid ordering
(Laurdan, as in panel d) in single MEFs with (FAK-WT) or without FAK
(FAK-KO). n = 4 experiments, each >10* cells. *P values (t-test) < 107*.
i, Because some GM1 may not be accessible in formaldehyde-fixed cells, we
performed dot blot analysis of lipid extracts from FAK-KO and FAK-WT cells
using HRP-conjugated cholera toxin B. This indicates that FAK-WT cells
have higher levels of GM1 than FAK-KO cells. j, pAKT and pPDK1
immunostaining in cells without FAK (FAK-KO) exogenously loaded with
GM1 and cholesterol (FAK-KO + GM1 + Chol.), treated with DMSO, or
with 10 and 25 µM glyburide in DMSO (n = 3 experiments, each 10* cells, s.d.,
*P values (t-test) <10 7).
Extended Data Figure 8 | Phosphorylation of STAT3 and PAK1/2 is
sensitive to ABCA1-mediated membrane perturbation. a, Curve showing
the relationship between ABCA1-GFP expression and phosphorylated STAT3
(T705) and PAK1/2 (T423/T402) amounts in single cells. b, Quantification of
immunostaining of phosphorylated STAT3 (T705) and PAK1/2 (T423/T402)
amounts in FAK-KO cells after exogenous loading of the plasma
membrane with cholesterol and GM1 (s.d., n = 4 experiments, each with
10* cells, t-test).


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure: a, flow chart of the algorithm, from gene list through Module 1 (get gene data: name, sequence, chromosome, position, orientation), Module 2 (get TF binding profiles from the ENCODE database for each gene), Module 3 (filter profiles: position-specific peak validation, cell line average, replicate average) to Module 4 (cluster genes according to their TF impact profiles). b, heat map of genes against transcription factor impacts, with z-score binding strength.]

Extended Data Figure 9 | Hierarchical clustering of human ABC
transporters according to 118 transcription factor binding profiles from the
ENCODE database. a, Diagram of the algorithm used to generate ABC
transporter clusters. b, Heat map of the cluster of ABC transporters containing
ABCA1, A9, A6 and G1 that share Tal1 binding (see bar graph representation of
Tal1 binding on the right). These 4 ABC transporters are the same 4 ABC
transporters that were found more highly expressed in cells lacking FAK (FAK-KO)
(see Extended Data Fig. 1c).
Mechanical induction of the tumorigenic β-catenin
pathway by tumour growth pressure
Heldmuth Latorre-Ossa, Colette Rey, Laura Fouassier, Audrey Claperon, Laura Brullé, Elodie Girard, Nicolas Servant,
Christine Ménager, Silvia Fre, Sylvie Robine & Emmanuel Farge


The tumour microenvironment may contribute to tumorigenesis 
owing to mechanical forces such as fibrotic stiffness or mechanical 
pressure caused by the expansion of hyper-proliferative cells’. 
Here we explore the contribution of the mechanical pressure 
exerted by tumour growth onto non-tumorous adjacent epithe- 
lium. In the early stage of mouse colon tumour development in 
the Notch+ Apc+/1638N mouse model, we observed mechanistic
pressure stress in the non-tumorous epithelial cells caused by
hyper-proliferative adjacent crypts overexpressing active Notch,
which is associated with increased Ret and β-catenin signalling.
We thus developed a method that allows the delivery of a defined 
mechanical pressure in vivo, by subcutaneously inserting a magnet 
close to the mouse colon. The implanted magnet generated a mag- 
netic force on ultra-magnetic liposomes, stabilized in the mesench- 
ymal cells of the connective tissue surrounding colonic crypts after 
intravenous injection. The magnetically induced pressure quanti- 
tatively mimicked the endogenous early tumour growth stress in 
the order of 1,200 Pa, without affecting tissue stiffness, as mon- 
itored by ultrasound strain imaging and shear wave elastography. 
The exertion of pressure mimicking that of tumour growth led to 
rapid Ret activation and downstream phosphorylation of β-catenin
on Tyr654, impairing its interaction with E-cadherin in adherens
junctions, which was followed by β-catenin nuclear translocation
after 15 days. As a consequence, increased expression of
β-catenin-target genes was observed at 1 month, together with
crypt enlargement accompanying the formation of early tumorous
aberrant crypt foci. Mechanical activation of the tumorigenic
β-catenin pathway suggests unexplored modes of tumour propagation
based on mechanical signalling pathways in healthy epithelial
cells surrounding the tumour, which may contribute to tumour 
heterogeneity. 

To test the tumorous impact of early tumour growth pressure on 
non-tumorous epithelial tissues in vivo, apart from the mechanical 
stiffness characteristic of the microenvironment of late tumours’ ® 
(see Supplementary Information 1), we first measured the strain 
deformation of live tumorous colonic crypts ex vivo under spinning 
disk microscopy. We used Notch/Apc mice, characterized by hyper- 
proliferative Notch overexpressing crypts within the Apc heterozygous 
genetic background (Notch+ Apc+/1638N)’. Colon crypt stress was
deduced from their strain deformation induced by Notch+ hyper-
proliferative crypts, elasticity measurements, and Hooke’s law. It
was found to range from 0 to 2.4 kPa at 2 weeks (S = 1.2 ± 1.2 kPa
(mean ± s.d.)), and from 0.4 to 5.6 kPa at 1 month (S = 3 ± 2.6 kPa)
(Supplementary Information 2 and Extended Data Fig. 1a–c). These
values are consistent with the pathophysiological oncogene growth- 
induced stress of a few kPa induced by tumour growth”*”. 

To apply and mimic a mechanical pressure in the same order of 
magnitude as the one generated by hyper-proliferative crypts in vivo, 
we developed a method based on the intravenous injection of stable 
ultra-magnetic liposomes (UML) encapsulating super-paramagnetic 
iron oxide nanocrystals. A 3-mm cylindrical magnet was inserted sub-
cutaneously in front of the colon, which caused loading of the colon
tissue with the liposomes (Extended Data Figs 1d, 2a and
Supplementary Information 3). Confocal microscopy showed UML
loading in the stromal cells surrounding the distal colon one week
after injection (Extended Data Fig. 2b). This remained stable, concen-
trated at 250 ± 50 nmol of Fe(III) per g of tissue, for 1 month
(Extended Data Fig. 2c). Immunofluorescence tissue analyses revealed
UML internalization within the mesenchymal cells surrounding the
crypts, as determined by vimentin co-labelling (Fig. 1a).

The next step was to measure the tissue stress associated with the 
presence of UML exposed to a magnetic field gradient in the colon of 
wild-type mice. The compression strain generated by the 3 mm (0.12 T)
magnet on the UML-loaded tissue along the y direction of the wild-
type colon was found to be ε = 4.3 ± 2.1% (Fig. 1b, Extended Data Fig.
2a, d–f and Supplementary Information 4). Quantitative Young’s
modulus maps of wild-type colons revealed a mean stiffness value
of E = 30.1 ± 3.5 kPa (mean Young’s modulus), which was not
significantly changed by magnetic loading (35 ± 3 kPa;
Extended Data Fig. 2g). These measurements enabled an estimation
of a mean local uniaxial stress S = E × ε = 1.194 ± 0.610 kPa generated
by the magnet on the colon, making the assumption of an almost
uniaxial strain (along y direction from the colon to the magnet) and
consequently a one-dimensional Hooke’s law (S = E × ε) approxi-
mation. This showed the existence of 100-µm acoustic resolution
domains of stress ranging from 0.6 to 1.8 kPa.
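
The one-dimensional Hooke's law estimate can be sketched in a few lines of Python. The sketch multiplies the rounded mean Young's modulus by the rounded mean strain quoted in the text and, as an assumption not stated in the paper, treats the two uncertainties as independent and combines them in quadrature. The function name is hypothetical. Because it uses rounded means, it gives a value of about 1.3 kPa, the same order as the reported 1.194 kPa but not identical to it; the reported figure was presumably derived from the underlying measurements.

```python
import math


def uniaxial_stress(modulus_kpa, modulus_sd, strain, strain_sd):
    """Return (stress, sd) in kPa for S = E * strain.

    The uncertainty uses first-order propagation for a product of two
    independent quantities (an assumption made for this illustration).
    """
    stress = modulus_kpa * strain
    relative_sd = math.sqrt((modulus_sd / modulus_kpa) ** 2
                            + (strain_sd / strain) ** 2)
    return stress, stress * relative_sd


# Rounded values quoted in the text: E = 30.1 ± 3.5 kPa, strain = 4.3 ± 2.1%.
stress, sd = uniaxial_stress(30.1, 3.5, 0.043, 0.021)
print(f"S = {stress:.2f} ± {sd:.2f} kPa")
```

The large relative uncertainty of the strain dominates the result, which is why the estimated stress spans roughly a factor of three across the tissue.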

Notably, these values are consistent with the pathophysiological 
stress induced by tumour growth, as reported in the literature**”. 
They are equivalent to the mechanical stress we measured in aberrant
crypts generated in Notch/Apc mutant mice 2 weeks to 1 month
after aberrant crypt foci (ACF) development (Extended Data Fig. 1a).

We then applied this magnetically induced pressure in vivo, to the 
colon of Apc single-mutant mice (Apc+/1638N), in which the β-catenin
signalling pathway was specifically found to be mechanically activated 
Figure 1 | Stable magnetic loading of the colon tissue to mimic tumour
growth pressure in vivo. a, Rhodamine-labelled UML observed 1 week after
injection in wild-type colon explants, in contrast to non-injected controls,
localized in the connective tissue between the crypts. Co-localization of UML
(in red) with vimentin (in green) (observed in n = 2 mice, in the 200 crypts
analysed). Scale bars are 10 µm. b, Strain map images of wild-type colon 1 week
after UML injection. Mean local strain compression of magnetized wild-type
colon (measured in n = 6 mice) compared to non-magnetized colon (measured
in n = 3 mice) is ε = 4.3 ± 2.1%.


in response to 20-min perturbation ex vivo’’. This resulted in an onco-
genic response leading to the increased expression of the β-catenin
target tumorigenic Myc, Axin2 and Zeb1'°” after 1 month of pressure 
application in the crypts (Fig. 2 and Extended Data Fig. 3a). 
Quantification by qRT-PCR (quantitative reverse-transcriptase 
PCR) revealed an increase in expression of Myc, Axin2 and Zeb1 by 
factors of 2.3, 2.5 and 3.5, respectively, due to magnetic pressure mim- 
icking growth-induced mechanical tissue stress (Extended Data Fig. 
3a). 
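
This section does not state how the fold changes were computed from the qRT-PCR data. One widely used approach is the 2^(-ΔΔCt) method, sketched below in Python as an assumption rather than as the authors' procedure. The function name, the reference gene and the Ct values are all hypothetical; the Ct values were chosen only so that the result lands near the factor of 2.3 quoted for Myc.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each target Ct is first normalized to a reference gene in the same
    sample; the treated sample is then compared with the control.
    Assumes close to 100% amplification efficiency for both genes.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values, not data from the study.
fold = fold_change_ddct(ct_target_treated=24.8, ct_ref_treated=18.0,
                        ct_target_control=26.0, ct_ref_control=18.0)
print(round(fold, 2))
```

A target that reaches threshold 1.2 cycles earlier than in the control, after normalization, corresponds to roughly a 2.3-fold increase.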

Consistent with this transcriptional response, after 2 weeks of mech-
anical stress, we observed an increase of the phosphorylation of β-cate-
nin on Y654, which prevents its interaction with junctional E-cadherin
and allows its release from junctions’* (Fig. 2, yellow arrows).
Consistently, the phospho-β-catenin did not co-localize with sub-api-
cal junctional E-cadherins, but was detected more apically at the brush
border, as assessed by co-localization with villin (Extended Data Fig.
4b). Apc-mutant crypts displayed an increased number of cells with a
cytoplasmic enrichment in β-catenin, as well as nuclear enrichment, as
shown by DNA co-staining with DAPI (Extended Data Fig. 3b, stained
in white), attesting β-catenin accumulation both in the cytoplasm and
nucleus in intestinal epithelial cells'*. Consistent with the mechanical
induction of Myc (Fig. 2), we observed an increase in the number of 
Ki67-positive proliferative cells (by 1.5-fold compared to control mice) 
after 2 weeks of mechanical stress (Extended Data Fig. 4a and 
Supplementary Note), and a consequent increase in the number of 
crypts larger than 1,500 µm² (Extended Data Fig. 3c). This corre-
sponded to an increase in mean crypt size of 22% after 1 month, which
is similar to the value recorded during the appearance of ACF in 
Notch/Apc mice, 1 month after Notch activation (Extended Data 
Fig. 3c). 

Using small-animal colonoscopy, we could visualize the mechanical 
induction of ACF initiating 5 days after stress application, leading to 
adenoma development after 15 days in Apc+/1638N mice (Extended
Data Fig. 5a, b and Supplementary Information 5 with associated 
Extended Data Fig. 2c, h, i). Notably, no loss of Apc heterozygosity’ 
could be detected in the epithelia in response to mechanical pressure at 
this time, in agreement with mechanically induced ACF formation 
(Extended Data Fig. 5c). 
Figure 2 | Activation of the Wnt/β-cat pathway by magnetic deformation
mimicking tumour growth in vivo, in Apc-deficient mice colon. a, Top, Myc
expression in the crypts, 1 month after the initiation of mechanical strain with
UML and magnet mimicking tumour strain (measured in n = 5 mice, 487
crypts analysed), in control (measured in n = 3 mice, 352 crypts analysed) and
mice injected with UML without magnet (measured in n = 2 mice, 167 crypts
analysed), observed by immunofluorescence. Bottom, β-catenin Y654
phosphorylation after 15 days under UML with magnet conditions (measured
in n = 2 mice, 112 crypts analysed), compared to control (measured in n = 2
mice, 129 crypts analysed) and to mice injected with UML without magnet
(measured in n = 2 mice, 92 crypts analysed). Scale bars are 10 µm.
b, Quantification of a. P > 0.05 for all control versus UML comparisons. ***P
< 0.001, Student’s t-test for comparison between all UML + magnet versus
UML without magnet.


Therefore, a magnetic pressure mimicking the strain generated by 
tumour growth triggered the activation of the Wnt/β-catenin pathway,
leading to the mechanical induction of Myc, Axin2 and Zeb1 express-
ion, genes critically involved in tumour growth, and to ACF formation
in Apc+/1638N mice in vivo.

Then, we aimed to gain insights into the molecular mechanisms 
underlying the observed pro-tumorigenic effects of magnetically 
induced pressure. We found that mechanical phosphorylation of Ret is
required upstream of the tumorigenic mechanosensitive pathway, ex
vivo (Extended Data Figs 4d, e, g, 6a, b and Supplementary
Information 6). To test whether tumour growth pressure could activ-
ate Ret in vivo, we first applied the magnetically induced pressure
mimicking the σ = 1.194 ± 0.660 kPa tumour-growth-induced stress
in vivo, to heterozygous Apc+/1638N mutant mice. We observed Ret
phosphorylation in Apc+/1638N crypts at 1 week of induced mechanical
stress (Fig. 3 and Extended Data Fig. 6c), initiating as early as after 30
min, coincident with UML accumulation (Extended Data Fig. 7a, b
and Supplementary Information 6), consistent with the rapid
dynamics associated with mechanotransductive processes and a direct
mechanical activation of Ret. Ret activation, and downstream β-cate-
nin Y654 phosphorylation and Myc induction, were all found to be
independent of any indirect effect of UML loading per se (Figs. 2, 3
and Extended Data Fig. 7c, d), or long-range effects of liposomes or
magnets either in the colon or outside the colon (Supplementary
Information 7). We consistently found no expression in the colon of the
canonical Ret ligands GDNF, artemin and neurturin (Extended Data
Figure 3 | Mechanical activation of pY1062 Ret kinase after in vivo
magnetic deformation of Apc-deficient mice colon. a, Phosphorylated Ret
Y1062 (pRet) in Apc+/1638N colon explants subjected to mechanical strain with
UML and magnet mimicking tumour strain at 1 week (measured in n = 4 mice,
697 crypts analysed) compared to controls without UML (measured in n = 4
mice, 556 crypts analysed) and with UML without magnet (measured in n = 3
mice, 627 crypts analysed). Scale bar is 10 µm. b, Quantification of a. P > 0.05
for control to UML comparison. ***P < 0.001, Student’s t-test.


Fig. 7e), which confirmed Ret mechanical activation in these experi- 
mental conditions. 

We additionally found Akt-dependent Ret mechanical activation 
upstream of GSK-3β inactivation, a protein that prevents nuclear
accumulation of the β-catenin released into the cytoplasm
(Extended Data Fig. 7f, g and Supplementary Information 8)'*"”.

Importantly, when we applied a magnetic pressure generating a
tissue stress corresponding to S = 1.194 ± 0.660 kPa to wild-type
colon, we obtained the same results as in Apc+/1638N mice, namely
Ret activation, β-catenin phosphorylation and release from sub-apical
junctions, co-localization with villin and increased nuclear transloca-
tion of β-catenin (Extended Data Figs 4c and 6d, e). Myc
overexpression and ACF formation with induced internal growth
pressure were also observed, at 2 and 3 months, respectively (Extended
Data Figs 6d, e, f and Supplementary Information 9), with no loss of
the Apc allele in wild-type mice (Supplementary Information 9).
Figure 4 | Mechanical activation of the Ret/β-catenin/Myc pathway in non-
hyperproliferative Notch-negative crypts of Notch/Apc mice.
a, Phosphorylation of Ret Y1062 (pRet) in Notch/Apc and overexpression of
Myc in non-tumorous Notch-negative crypts (GFP negative, white boxes),
compared to control. Tam., tamoxifen. Scale bars are 10 µm. b, Quantitative
results of a. ***P < 0.001, Student’s t-test. Phosphorylation of Ret Y1062:
control (measured in n = 4 mice, 278 crypts analysed) and Notch/Apc
(measured in n = 4 mice, 105 crypts analysed). Myc expression: control
(measured in n = 2 mice, 200 crypts analysed) and Notch/Apc (measured in
n = 2 mice, 150 crypts analysed). % indicates % of positive crypts.
Finally, we tested whether such a mechanotransduction pathway
was activated by the endogenous physical pressure induced by hyper- 
proliferative effects of Notch-activated crypts on Notch-negative 
crypts, in Notch/Apc mutant mice colon. 

First, we found that both Ret and β-catenin were widely phosphory-
lated in aberrant Notch/Apc mutant colonic crypts. Enrichment in
cytoplasmic and nuclear β-catenin was also observed 1 month after
Notch activation, as well as Myc overexpression after 2.5 months
(Extended Data Figs 8a, b, 4f and Supplementary Information 10). 

Specifically, Ret phosphorylation, β-catenin phosphorylation and
nuclear translocation, as well as Myc overexpression were observed
not only in Notch-expressing crypts, but also frequently in non-
tumorous crypts lacking Notch activation (Fig. 4 and Extended Data
Fig. 8a, b). A substantial number of these crypts, which activate Ret,
are surrounded by Notch negative crypts (Extended Data Fig. 9a). This
is in agreement with long-range non cell-autonomous mechanical
induction in healthy tissue by the strain developed by distant Notch
hyperproliferative domains (Supplementary Information 11 with
Extended Data Fig. 9a, b, c, d). No exonic mutations were detected
in Ret in Notch/Apc mice epithelia or in UML + magnet Apc+/1638N
mice, 3 months after tamoxifen injection and magnetic pressure,
respectively, as well as in the wild-type copy of Apc+/1638N (not
shown).

Altogether, our results show the mechanical activation of the
tumorigenic Ret/β-catenin pathway in response to the pressure char-
acteristic of hyperproliferative tumour growth in colon epi-
thelia in both Apc+/1638N and wild-type mice. We found that such a
mechanotransduction process also takes place in non-tumorous
Apc+/1638N heterozygous crypt cells compressed by Notch+ hyperpro-
liferative crypts of Notch/Apc mice (Supplementary Discussion 1).
Based on these findings, we postulate that mechanical stimulation of
tumorigenic pathways could potentially occur in normal tissues neigh-
bouring any type of tumour, contributing to an unstable positive feed-
back loop between oncogene expression and tumour induction,
enhancing the growth and spread of the tumour (Supplementary
Discussions 1, 2, 3 and Extended Data Fig. 10a–c).


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Detailed protocols are deposited in Protocol Exchange, 
http://dx.doi.org/10.1038/protex.2014.052. 

Transgenic mice and tamoxifen administration. All mice used in this work have 
been previously described: Apc+/1638N (ref. 18), VilCreERT2/Nic and
VilCreERT2/Nic;Apc+/1638N (ref. 7), including males and females. C57Bl/6 inbred
mice were obtained from Charles River Laboratories and females were used as 
wild-type controls. 

For the experiments on VilCreERT2/Nic;Apc mice, four-week-old mice 
were injected intraperitoneally with tamoxifen (ICN) (50 µg per g of animal body
weight) for five consecutive days. All animal experiments conformed to NIH 
guidelines for the care and use of animals and were approved by the local ethics 
committee (Approval no. 2012-0016). In this mouse model, a tamoxifen-inducible 
Cre recombinase (CreERT2) expressed under the control of the intestinal-specific 
villin promoter induces the expression of an intracellular and constitutively active 
form of the mitotic Notch 1 receptor (N1IC), along with a nuclear GFP reporter. 
As the VilCreERT2 mice are mosaic, tamoxifen injection results in clones of 
mitotic cells expressing N1IC (N marked by GFP) within Apc+/1638N heterozygous
colonic cells. 

Synthesis of magnetic nanoparticles. The aqueous suspension of magnetic nano-
particles was prepared using alkaline co-precipitation of FeCl2 (0.9 mol) and FeCl3
(1.5 mol) salts, according to Massart’s procedure’. Superparamagnetic maghe-
mite grains (γ-Fe2O3) were obtained by oxidizing 1.3 mol of magnetite with 1.3
mol of iron nitrate (boiling solution). After magnetic decantation, 2 l of distilled
water and 360 ml of HNO3 20% were added to the solution and the mixture was
stirred for 10 min. Prepared maghemite nanoparticles were washed several times
with acetone (3 × 1 l) and ether (2 × 500 ml) and suspended in water. Size sorting
was performed by adding HNO3 (0.45 M) to the suspension followed by magnetic
decantation. This operation was repeated with the deposit until a suitable particle
size was obtained. Sodium citrate (1p,./ncit = 0.13, molar ratio) was added to the
nanoparticles and the mixture was heated at 80 °C for 30 min to promote adsorp-
tion of citrate anions onto their surface. Citrated nanoparticles were precipitated in
acetone and suspended in water. The volume fraction and average size of the
maghemite grains were determined by fitting the magnetization curve of nano-
particles using Langevin’s law. Particles of 7.7 nm (standard deviation σ = 0.37)
and of 9 nm diameter (standard deviation σ = 0.35, volume fraction of nanopar-
ticles in the suspension φ = 1.9%, specific susceptibility χ/φ of 15.5) were obtained
and used for magnetic liposomes and ultra-magnetic liposomes (UMLs), respect-
ively. The aqueous medium was removed using a Macrosep advance centrifugal
device (PALL) and nanoparticles were suspended again in a buffer (0.108 M NaCl,
0.02 M sodium citrate and 0.01 M HEPES, pH = 7.4).

Preparation of ultra-magnetic liposomes (UMLs). Solutions of 1,2-dipalmitoyl- 
sn-glycero-3-phosphocholine (DPPC), 1,2-distearoyl-sn-glycero-3-phosphocholine 
(DSPC), 1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[carboxy(polyethylene 
glycol)-2000] (ammonium salt) (DSPE-PEG2000) and L-α- 
phosphatidylethanolamine-N-(lissamine rhodamine sulfonyl B) (ammonium 
salt) (rhodamine-PE) in chloroform were purchased from Avanti Polar Lipids, 
Inc. UMLs were prepared by the reverse phase evaporation method established in 
ref. 20 and modified according to a previously described protocol. In brief, a 
mixture of DPPC/DSPC/Rhod-PE/DSPE-PEG2000 (84/10/1/5 mol%, 315 µl) was 
dissolved in 3 ml of diethyl ether (VWR) and 900 µl of chloroform (Carlo Erba 
reagents). Thereafter, 1 ml of citrated magnetic nanoparticles dispersed in the 
buffer was introduced before sonication at room temperature for 20 min to pro- 
duce a water-in-oil emulsion. Organic solvent was evaporated with a rotavapour 
R-210 (Buchi) at 25 °C until the gel phase disappeared. Liposomes were filtered 
through a 450 nm filter and purified from non-encapsulated magnetic nanoparticles 
by magnetic sorting using a strong magnet (Calamit Fe-Nd-B, 150 × 100 × 
25 mm). The operation was repeated three times every 2 h and liposomes (highly 
concentrated at 30% of nanoparticles in volume) were finally separated from the 
supernatant and recovered. 

Preparation of magnetic liposomes (MLs). Magnetic liposomes were prepared 
according to a procedure already described. In brief, the magnetic fluid used to 
load the liposomes consisted of an aqueous suspension of 8-nm nanocrystals of 
maghemite (γ-Fe₂O₃) synthesized according to the procedure described above. In 
these conditions, superparamagnetic grains (from magnetization curve) were 
produced. Final adjustment of both aqueous medium and maghemite concentration 
(5.4 ± 0.1 M Fe(III), checked by flame spectrometry) was performed by 
ultrafiltration (MACROSEP filter, cut-off 50 kD, Fisher Scientific Labosi) followed 
by addition of the same buffer as that used for liposome preparation (0.108 M 
NaCl, 0.02 M sodium citrate and 0.01 M HEPES, pH = 7.4). Final magnetic fluid 
was composed of nanoparticles of 20.3 nm in hydrodynamic diameter (0.1 
polydispersity index from QELS measurements) and stabilized by adsorption of citrate 
anions onto their surface. 

Rhodamine-labelled magnetic-fluid-loaded liposomes (MFLs) were prepared 
by hydration of a thin lipid film (EPC:DSPE-PEG2000:rhodamine-PE; 94:5:1 
mol%, determined by weight, precision of 5.10” * g) with equal volumes of mag- 
netic fluid and buffer to get a total lipid concentration of 20 mM (checked by 
fluorescence spectroscopy). Hydration was followed by sequential extrusion under 
nitrogen pressure (<10 bar) at 25 °C through polycarbonate filters with decreasing 
pore diameters of 0.8 µm/0.4 µm/0.2 µm (PORETICS, Osmotics, Livermore, 
USA). Liposomes of 226 ± 38 nm in hydrodynamic diameter (from QELS measurements) 
were recovered. Non-entrapped maghemite particles were separated 
by gel exclusion chromatography (GEC) performed with a 0.4 cm × 5.8 cm 
Sephacryl S1000 superfine (Pharmacia) microcolumn (TERUMO 1 ml syringe) 
previously saturated with EPC:DSPE-PEG2000 (95:5 mol%) liposomes, prepared 
similarly to MFLs but without maghemite. The eluent was the buffer used for 
liposome preparation. Final iron content of the preparation was checked by ESR 
analysis and found equal to 44 ± 1 mM [Fe(III)], that is, 2.2 iron to lipid molar 
ratio (0.7% in volume concentration of nanoparticles). 

Quasi-elastic light scattering (QELS). Hydrodynamic diameters were deter- 
mined by using a photon correlator spectrometer (10 mW HeNe 632.8-nm laser, 
Zetasizer Nano ZS90, Malvern Instruments) at T = 25 °C and 90° scattering angle 
from the measured translational diffusion coefficients of the particles according to 
the Stokes-Einstein law for non-interacting spherical particles. Just before mea- 
surements, the samples were diluted with buffer to optimize the response of the 
apparatus. Unimodal distribution analysis was used and measurements were per- 
formed in triplicate. 
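
As a rough illustration of the Stokes-Einstein conversion described above, the sketch below turns a translational diffusion coefficient into a hydrodynamic diameter at 25 °C. The function names and the viscosity value (0.89 mPa s, pure water) are assumptions made for this example; the processing actually done by the instrument software is not described here.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def hydrodynamic_diameter(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=0.89e-3):
    """Stokes-Einstein for non-interacting spheres: d_h = k_B T / (3 pi eta D), in metres."""
    return BOLTZMANN * temperature_k / (3.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


def diffusion_coefficient(diameter_m, temperature_k=298.15, viscosity_pa_s=0.89e-3):
    """Inverse relation: D = k_B T / (3 pi eta d_h), in m^2/s."""
    return BOLTZMANN * temperature_k / (3.0 * math.pi * viscosity_pa_s * diameter_m)


# Assumed example value: a diffusion coefficient of 2.17e-12 m^2/s
example_diameter_nm = hydrodynamic_diameter(2.17e-12) * 1e9
```

With these assumed inputs the diameter comes out in the low hundreds of nanometres, the same range as the liposome sizes quoted in the text.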

Injection of UMLs and MLs. UMLs and MLs were injected intravenously in 
Apc+/1638N and wild-type mice at 3-4 months (Extended Data Fig. 2a). The 
essential prerequisite of the experiments was to develop a methodology for introducing 
magnetic particles into the mesenchymal tissue of the colon at the intracellular 
level and at concentrations sufficiently high to allow subsequent magnetic 
manipulation. Furthermore, the mode of delivery of the magnetic material must 
preserve the tissue from any other stress that would compete with 
the mechanical constraint generated magnetically. Thus the delivery must be achieved 
through an indirect route of administration. Systemic delivery via intravenous 
injection performed outside the region of the colon ranks among the best ways, 
provided that the pharmacokinetics of the magnetic particles is optimized to 
reduce first-pass hepatic clearance and permit observable distribution in the colon 
tissue. In this respect, we used submicron liposomes sterically stabilized by a polyethylene 
glycol (PEG) coating as a bioavailability-enhancing carrier of the magnetic 
particles. Indeed, PEGylated magnetic-fluid-loaded liposomes not exceeding 200 
nm in diameter have been reliably shown to be long-circulating systems that remain intact 
vesicle structures without leakage of their inner content, thereby averting 
dilution of the magnetic material. Moreover, they have been shown to diffuse from 
the vasculature into the interstitial tissues without loss of structural integrity, and at 
the intracellular level as well. In other words, the containment of the magnetic 
particles required for magnetic manipulation, adjusted beforehand within the 
liposomes, is fully conserved during their passage through the vascular endothelium 
towards the surrounding tissues and upon cellular uptake. 

Mechanical deformations. Ex vivo global compression. To apply an ex vivo mechanical 
compression, colon samples were treated as previously described with a 
confinement box of 3 mm thickness for 0.8 kPa pressure applied. For kinase 
inhibition experiments, PP1, PP2 (30 µM; BioSource), SU6656 (20 µM; Sigma- 
Aldrich), sunitinib (20 µM; Sigma-Aldrich), vandetanib (10 µM; Selleck Chemicals), 
ponatinib (100 nM; Selleck Chemicals), or an equivalent volume of the vehicle 
DMSO (diluted to 1 in 500), was added to reduced serum medium 20 min prior to 
starting the compression. 

In vivo magnetic deformation. Mice were anaesthetized with isoflurane (delivered 
at 2% for maintenance and 1.5% for induction in oxygen) and a 3 mm diameter 
magnet with a strong magnetic field gradient was positioned subcutaneously on the back of 
the mice in front of the colon (Extended Data Fig. 2a). The fluorescent UMLs 
(associated with rhodamine-PE phospholipids) were diluted in a 10 mM 
HEPES pH 7.4, 20 mM Na₃citrate, 108 mM NaCl buffer solution to a final 
concentration of 0.1 M and injected in the lateral caudal veins of the tail. The only 
rare side-effect of injection was a small hematoma that disappeared within two to three 
days. Magnet implantation caused only local skin inflammation that was 
treated with antiseptics and disappeared in three to four days. No effect on intestinal 
transit and no dysfunction of the colon, which was largely isolated 
from skin, was detected. 

Pharmacological inhibitors. The Src family inhibitors used in this study are PP1 
and PP2, and SU6656 as a specific inhibitor of Src. The inhibitors of Ret used in 
this study are sunitinib, vandetanib and ponatinib. 

Live tissue imaging and analysis. The distal colon was dissected from adult mice, 
rinsed with PBS and incubated at 37 °C in Leibovitz's L-15 medium + Glutamax-1 
supplemented with 10% fetal bovine serum and 40 µg ml⁻¹ gentamicin. Individual 
tissue segments were opened longitudinally and placed villi up between two cover- 
slides. Images were obtained with an inverted Olympus IX70-Ropper spinning 
disc microscope coupled to a Cool Snap HQ2 camera. 

Iron concentration analyses. Iron concentration analyses were performed by electron 
spin resonance (ESR) analysis. UML and magnetic liposome biodistribution 
was assessed by determining the amount of iron oxide in excised tissues using ESR 
spectrometry. ESR spectra were acquired using an Elexsys E500 ESR spectrometer 
(Bruker Biospin) operated at: resonant frequency ≈ 9.2 GHz; microwave power = 
10.11 mW; receiver gain = 60 dB; and magnetic field modulation amplitude = 
10 G, at room temperature. After the sacrifice of the mouse, distal colon samples 
were lyophilized for 24 h. Maghemite nanocrystal suspensions used for each 
liposome sample and of known iron concentration were used as standards to 
construct a calibration curve (R² = 0.99) for quantitative purposes. ESR spectra 
were collected as the first derivative of the absorbed microwave power versus the 
magnetic field. A double integration was applied to obtain the area under the curve 
of the absorption-field curve, which is proportional to the number of resonating 
electronic spins in the measured sample. All tissue samples studied were corrected 
for background tissue absorption of the microwave radiation using tissue samples 
of control mice not exposed to UML. All the double integrations were calculated 
over the field interval (1500-5500 Gauss). Iron amounts were normalized by tissue 
sample weights (mol of Fe per mg tissue). Iron concentration in Apc+/1638N at 1 
week was confirmed to be of the same order of magnitude as at 1 week in the 
wild type before being quantitatively evaluated for each time point condition, 
based on the quantitative comparison of the mean density of fluorescence associated 
with UML between each Apc+/1638N sample and the mean fluorescence value of 
the wild-type samples (using ImageJ), based on ten immunohistological samples 
per mouse. 
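
The quantification steps described above (double integration of the first-derivative spectrum, background subtraction, calibration and normalization by tissue weight) can be sketched as follows. This is a minimal illustration and not the authors' processing code: the function names, the linear calibration through the origin and the trapezoidal integration are assumptions.

```python
def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y over x, starting at 0."""
    out = [0.0]
    for i in range(1, len(x)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1]))
    return out


def double_integral(derivative_signal, field_gauss):
    """Integrate the first-derivative ESR spectrum twice.

    The first pass recovers the absorption curve; the second gives its area,
    which is proportional to the number of resonating spins.
    """
    absorption = cumulative_trapezoid(derivative_signal, field_gauss)
    return cumulative_trapezoid(absorption, field_gauss)[-1]


def iron_per_tissue_mass(sample_area, background_area, calibration_slope, tissue_mg):
    """Background-corrected area converted to mol Fe per mg of tissue.

    calibration_slope (area per mol Fe) is assumed to come from a linear fit
    of standards of known iron concentration.
    """
    return (sample_area - background_area) / calibration_slope / tissue_mg
```

In use, `double_integral` would be evaluated over the 1500-5500 Gauss interval for both the sample and a control tissue, and the two areas passed to `iron_per_tissue_mass`.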

Acoustic analysis. A distal colon explant was completely embedded in an agar- 
gelatin phantom. The ultrasonic probe was fixed on one side of the phantom and a 
small magnet approached axially towards the colon by a support. 

To measure the axial deformation (along the y direction), strain imaging was performed 
by acquiring an ultrasonic image before and after the magnet was displaced. 
The magnetic-force-induced tissue strain was measured by 
estimating local tissue displacement along the y direction from the first acquisition 
to the second one. Note that the time between moving the magnet and acquiring 
the data is less than 5 s, which is short enough to avoid creep behaviour. Average 
values were obtained from at least four measurements for each condition. 

Shear wave elastography (SWE) was performed to image quantitative tissue 
stiffness. A remote and tiny palpation (tens of µm) is induced by the acoustic 
radiation force of an initial focused ultrasonic beam and this radiation force 
generates a shear wave propagating along the x direction. Tissue displacements 
induced by this shear wave are mapped by performing ultrafast ultrasonic 
imaging. The local shear wave speed is linked to stiffness. 

Dissected colon samples were embedded in agar-gelatin phantoms (2% agar 
with 5% gelatin). Ultrasound images (so-called B-mode) and elasticity images were 
acquired using a high frequency ultrasound probe (15 MHz, 256 elements, 
Vermon) driven by an ultrafast imaging device (Aixplorer, Supersonic 
Imagine). SWE, that is, Young's modulus (E) quantitative imaging, was performed 
by using the supersonic shear wave imaging (SSI) technique ex vivo and 
in vivo. 

The SSI technique is based on the ultrafast ultrasound imaging of a shear wave 
induced by the radiation force of an initial focused ultrasonic beam acting as a remote 
palpation in tissues (Extended Data Fig. 2e). Under the assumptions of a local 
constant density $\rho$ and a locally incompressible and isotropic elastic medium, the 
propagation speed $v_s$ of the tracked shear wave is directly linked to the Young's 
modulus E (in kPa) characterizing the local stiffness via the relationship: $E = 3\rho v_s^2$. 
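
A short sketch of the relationship above, under the same assumptions (constant density, locally incompressible and isotropic medium), converts between shear wave speed and Young's modulus. The function names and the default density of 1,000 kg m⁻³ are choices made for this example.

```python
import math


def youngs_modulus_kpa(shear_speed_m_s, density_kg_m3=1000.0):
    """E = 3 * rho * v_s^2, returned in kPa."""
    return 3.0 * density_kg_m3 * shear_speed_m_s ** 2 / 1000.0


def shear_wave_speed(youngs_kpa, density_kg_m3=1000.0):
    """Inverse relation: v_s = sqrt(E / (3 * rho)), in m/s."""
    return math.sqrt(youngs_kpa * 1000.0 / (3.0 * density_kg_m3))


# A stiffness of about 35 kPa corresponds to a shear wave speed of roughly 3.4 m/s.
speed_for_35_kpa = shear_wave_speed(35.0)
```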

The assumption of constant density of in vivo soft biological tissues is valid here. 
Indeed, in biological soft tissues, the density is almost constant, ρ ≈ 1,000 kg m⁻³, 
due to the very high water content of soft tissues. However, small density variations 
exist, as shown in extensive past studies. From these studies, the mean density 
(among connective tissues, muscle, fat, blood cells, plasma, cornea, spinal cord, 
spleen, testis) is 1,052 ± 47 kg m⁻³. Thus, the normalized standard deviation in 
soft tissues is 4.7%. 

Also, due to their high fluid content, many soft biological tissues and gels exhibit 
nearly incompressible behaviour under physiological loading; they are constrained 
to undergo essentially volume-preserving deformations and motions. Thus, it is 
commonly accepted in the field of tissue elasticity measurements that the Poisson’s 
ratio of tissue has a value between 0.49 and 0.4999, meaning that tissue is nearly 
incompressible. Of course, this incompressible behaviour is only ensured pro- 
vided that the conditions of interest do not allow the water to diffuse into or out of 
the tissue during the period of interest. This is the case in the SWE approach, due to 
the very small (micrometric) displacements induced by the shear wave used to 
probe local elasticity. Such tiny displacements do not induce water diffusion 
outside of the organ. 

Although the assumption of local isotropy has to be made to derive the Young's 
modulus from the shear wave speed measurements, it is not currently possible to 
prove its validity in vivo. In particular tissues such as muscle, elastic anisotropy 
has already been demonstrated in vivo. However, the in vivo assessment of such 
anisotropic elasticity was only made possible in particular configurations, such 
as the human biceps, because the structural organization and orientation of the 
muscular fibre bundles is highly uniform over a large region of interest. Other 
tissues such as breast, arteries and liver are today assumed to be isotropic in the field of 
elastography. Even if a local anisotropy could potentially exist in these tissues, 
we postulate that it should remain quite small. Indeed, it was recently shown that 
the anisotropy of shear modulus in the in vivo kidney was quite small, with a 
fractional anisotropy of cortex and medulla <20% (see Fig. 3 of ref. 43), despite 
a more pronounced tissue organization in the kidney (due to the alignment of the 
pyramids) than in the other tissues. 

Extensive calibration experiments were performed in the past to demonstrate 
the ability of SWE to quantify the Young’s modulus of tissues. The standard 
deviation of Young’s modulus quantification was demonstrated to be lower than 
5% on calibrated phantoms mimicking biological tissue properties. A small 
magnet (3 mm in diameter) was axially approached towards the colon by steps of 
0.5 mm until completing 3 mm of absolute axial displacement. For each position of 
the magnet, strain images (ε) were calculated by comparing raw frequency ultrasound 
images acquired at two consecutive steps. Cumulative one-dimensional 
strain along the y direction was obtained by summing all strain images (Extended 
Data Fig. 2f). Although the magnet could induce some stress in the full volume in 
the three directions of space, it is here considered that the strain in the lateral (x) 
and elevational (z) directions remains small compared to the measured axial (y) 
strain. Under this assumption of a force that is mainly in the axial y direction, 
the quantitative stress S applied by the magnetic field acting on ferrofluids trapped 
in colon tissues was retrieved by applying the one-dimensional Hooke's law: S = 
E × ε (ref. 47). Using Hooke's law, the standard deviation of S was experimentally 
estimated and found to be equal to 0.61 kPa. It is in good agreement with a 0.74 kPa 
standard deviation of S derived from equation (1) that describes the influence of E 
and ε uncertainties (35.0 ± 3.0 kPa and 4.3 ± 2.1%, respectively) on the uncertainty 
of S: 


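$$\mathrm{std}(S) = \sqrt{\mathrm{var}(E \times \varepsilon)} = \sqrt{\mathrm{std}(E)^2\,\mathrm{std}(\varepsilon)^2 + \mathrm{std}(E)^2\,\varepsilon^2 + \mathrm{std}(\varepsilon)^2\,E^2} \qquad (1)$$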
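
As a numerical check of equation (1), the sketch below evaluates the standard deviation of the product S = E × ε from the quoted values (E = 35.0 ± 3.0 kPa, ε = 4.3 ± 2.1%). The formula is the variance of a product of two variables treated as independent; the function name is an assumption of this example.

```python
import math


def std_of_product(mean_e, std_e, mean_eps, std_eps):
    """Standard deviation of E * eps for independent E and eps (equation (1))."""
    variance = (
        std_e ** 2 * std_eps ** 2
        + std_e ** 2 * mean_eps ** 2
        + std_eps ** 2 * mean_e ** 2
    )
    return math.sqrt(variance)


# E in kPa, strain as a fraction (4.3% -> 0.043, 2.1% -> 0.021)
mean_stress_kpa = 35.0 * 0.043
stress_std_kpa = std_of_product(35.0, 3.0, 0.043, 0.021)
```

With these inputs the result is about 0.75 kPa, in line with the 0.74 kPa quoted in the text; the dominant term is the strain uncertainty multiplied by E.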


During the experiments, collection was performed no more than 4 s after the 
compression induced by the magnet motion. Such a small delay ensures that 
one can avoid any creep behaviour. Indeed, the relaxation time for typical human 
tissues under compression is of the order of several tens to hundreds of seconds. 
The distance between the colon and the magnet was evaluated using ultrasound 
imaging in vivo, with anal injection of a water gel transparent to ultrasound in the 
colon lumen to visualize the colon. 

Western blot. Isolation of epithelial cells of the colon was performed by incubation 
of tissue explants in 3 mM EDTA and 0.5 mM DTT in phosphate-buffered 
saline (PBS) for 45 min. Cell pellets were washed in PBS and lysed in radioimmune 
precipitation assay buffer (50 mM Tris-Cl (pH 8.0), 150 mM NaCl, 1% Nonidet 
P-40, 0.5% deoxycholate and 0.05% SDS) plus 1% phosphatase inhibitor cocktail, 
and 1% protease inhibitors. The protein content of the supernatant was measured 
by the colorimetric reaction RC DC protein assay (Bio-Rad). An equivalent quantity 
of protein (20 µg) was resolved by SDS-PAGE, transferred to a nitrocellulose 
membrane (Invitrogen), and hybridized with the appropriate antibodies, followed 
by detection using enhanced chemiluminescence (Pierce ECL Plus Western 
Blotting Substrate, Thermo Scientific). Protein levels were quantified using 
ImageJ. The following antibodies were used: anti-pY1062 Ret (1:100, Santa Cruz 
Biotechnology), anti-GAPDH (1:5,000, abcam), and horseradish peroxidase 
(HRP)-conjugated secondary antibodies (1:1,000, Jackson ImmunoResearch). 

RNA isolation and qRT-PCR. Total RNA isolation was performed by using 
NucleoSpin RNAII (Macherey-Nagel), according to the manufacturer's instructions, 
and quantified with a NanoDrop ND-1000 spectrophotometer. First-strand 
cDNA was synthesized by using AccuScript High Fidelity 1st Strand cDNA 
Synthesis kit (Agilent Technologies). For qPCR, reactions were run on a real-time 
PCR system (ABI Prism 7900; Applied Biosystems) and gene expression was 
detected with Power SYBR Green (Applied Biosystems). Relative gene expression 
was determined by normalizing to reference genes β2-microglobulin and TATA 
box binding protein TFIID (transcription factor IID), using the comparative 
threshold cycle (CT) method. Results were expressed as fold change of each 
sample versus control. The primers used in each reaction were as follows: Myc 
forward 5'-TCAAGAGGCGAACACACAAC-3' and reverse 5'-GGCCTTTTCATTGTTTTCCA-3'; 
Axin2 forward 5'-CGCCACCAAGACCTACATACG-3' 
and reverse 5'-ACATGACCGAGCCGATCTGT-3'; Zeb1 forward 
5'-TGGCAAGACAACGTGAAAGA-3' and reverse 5'-AACTGGGAAAATGCATCTGG-3'; 
TFII forward 5'-CCACGGACAACTGCGT-3' and reverse 5'-GGCTCATAGCTACTGA-3'; 
β2-microglobulin forward 5'-GCTATCCAGAAAACCCCTCAA-3' and reverse 
5'-AGGCGGGTGGAACTGTGTT-3'. Statistical 
data analyses were performed using a two-tailed unpaired Student's t-test between 
any two groups (n = 7). 
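
The comparative threshold cycle calculation mentioned above can be illustrated with a minimal sketch. It assumes 100% amplification efficiency (a doubling per cycle) and averages the CT values of the two reference genes before subtraction; both points, like the function name and the example values, are assumptions for illustration and not details given in the text.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_refs_sample, ct_target_control, ct_refs_control):
    """Comparative CT (2^-ddCT) fold change of a sample versus control.

    ct_refs_* are lists of CT values for the reference genes measured in the
    same sample (for example beta2-microglobulin and TFIID).
    """
    delta_sample = ct_target_sample - mean(ct_refs_sample)
    delta_control = ct_target_control - mean(ct_refs_control)
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical CT values: the target crosses threshold two cycles earlier
# (relative to the references) in the sample than in the control.
example = fold_change(22.0, [18.0, 20.0], 24.0, [18.0, 20.0])
```

In this hypothetical case the target is four times more abundant in the sample than in the control.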

Accession numbers for gene expression data. The UniProt accession numbers 
for sequences used in the gene expression analyses are: Axin2, 088566; Myc, 
P01108; and Zeb1, Q64318. 

Next-generation sequencing. Targeted re-sequencing of the Apc and Ret gene panel 
was designed with Ion AmpliSeq Designer v3.2. Multiplex PCR primers were 
selected to amplify the coding sequences of the selected genes: Apc (98.39% of 
the gene coding sequence covered by the design), Ctnnb1 (100%), Ret (99.8%), 
Gsk3b (100%) and Axin2 (96.26%) with an exon padding of 25 bp. Sequencing 
libraries were built with the Ion Ampliseq Library Kit v2.0 following manufac- 
turer’s recommendations (Life Technologies). In brief, 10 ng of each DNA sample 
was mixed with designed primer pools and supplied PCR mix to amplify targeted 
genomic regions. PCR products corresponding to PCR primers were partially 
digested and barcoded Ion adapters were ligated at the 5’ and 3’ extremities of 
digested PCR products. After purification of ligation products, PCR amplification 
followed by quality control was performed. Generated sequencing libraries were 
then pooled in equimolar ratio and sequencing templates were prepared on an Ion 
OneTouch system with the Ion OneTouch 200 Template Kit v2 DL (Life 
Technologies). After selective enrichment and loading on an Ion 318v2 chip, positive 
sequencing templates were sequenced on an Ion Torrent PGM with the Ion 
PGM Sequencing Kit 200 v2 (Life Technologies). Three technical replicates for 
each mouse DNA sample were processed in parallel to determine the technical 
variability to accurately evaluate the copy number variation on the targeted genes. 

Bioinformatics analysis. Sequenced reads were aligned on the mm10 reference 
genome using the TMAP software (v3.6.2, Life Technologies). Around 95% of the 
reads were aligned on the targeted genes with a mean depth of coverage of 1,750. 
The variant calling was performed using the Ion Torrent Variant Caller (v3.6.2, 
Life Technologies) and annotated with the RefSeq genes and dbSNP138 databases 
using the Annovar software. Variants found with a low coverage (≤30×) or with 
a strand bias were filtered out. 
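
The final filtering step can be sketched as below. The text gives the coverage cut-off but not how strand bias was measured, so the strand-bias definition (fraction of supporting reads on the dominant strand) and its 0.9 threshold are hypothetical, as are the class and function names; this is not the Variant Caller's own filter.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    forward_reads: int  # reads supporting the variant on the forward strand
    reverse_reads: int  # reads supporting the variant on the reverse strand

    @property
    def depth(self) -> int:
        return self.forward_reads + self.reverse_reads

    @property
    def strand_bias(self) -> float:
        """Fraction of reads on the dominant strand (0.5 = balanced, 1.0 = one strand only)."""
        if self.depth == 0:
            return 1.0
        return max(self.forward_reads, self.reverse_reads) / self.depth


def keep_variant(variant, min_depth=30, max_strand_bias=0.9):
    """Keep variants with coverage above min_depth and no strong strand bias.

    max_strand_bias is an assumed illustrative threshold.
    """
    return variant.depth > min_depth and variant.strand_bias <= max_strand_bias


def filter_variants(variants, **kwargs):
    return [v for v in variants if keep_variant(v, **kwargs)]
```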

Analysis of Apc status by genotyping. Isolation of epithelial cells of the colon was 
performed by incubation of tissue explants in 3 mM EDTA and 0.5 mM DTT in 
PBS for 45 min. The supernatant was then discarded, 10 ml of cold PBS added, and 
the mixture was shaken vigorously by hand before being spun for 20 min at 153g at 
4 °C to obtain the cell pellet. Total DNA isolation was performed by using the QIAamp 
DNA mini kit (Qiagen), according to the manufacturer's instructions. For PCR, 
gene expression was detected with BioTaq Red DNA Polymerase (Abcys). For Apc 
genotyping, the following primers were used: primer PA2, TCAGCCATGCCAACAAAGTCA; 
primer PN3, GCCAGCTCATTCCTCCACTC; and primer 
C2, GGAAAAGTTTATAGGT. Cycling conditions were 5 min at 94 °C (1 cycle); 
30 s at 94 °C, 45 s at 59 °C, and 345 s at 72 °C (35 cycles); and 5 min at 72 °C (1 
cycle). The presence of the wild-type allele is indicated by a 300-bp PCR fragment 
(primers PA2 and C2) and the mutant allele by a 400-bp PCR fragment (primers 
PA2 and PN3). 

Analysis of Apc status by DNA sequencing. Isolation of epithelial cells of the 
colon was performed by incubation of tissue explants in 3 mM EDTA and 0.5 mM 
DTT in phosphate-buffered saline (PBS) for 45 min. The supernatant was then 
discarded, 10 ml of cold PBS added, and the mixture was shaken vigorously by 
hand before being spun for 20 min at 153g at 4 °C to obtain the cell pellet. DNA 
was isolated according to the QIAamp DNA mini kit (Qiagen) in order to be 
analysed by the Sequencing Platform of the Institut Curie. A dedicated sequencing 
panel targeting the coding sequence of the Apc and Ret genes was used. Only one 
variant (D2086G) was detected in the Apc gene in one control mouse and one 
mouse subject to mechanical pressure. This single nucleotide change is referenced 
as a common polymorphism in the dbSNP database (rs47505115). No Y1062 
mutation was detected in the Ret gene. No exonic mutation was detected in the 
Ret gene (not shown). 

Histology and immunohistochemistry. Freshly dissected colon samples were 
treated as previously described. The commercial antibodies used were: anti- 
Myc N262 (1:50, Santa Cruz Biotechnology), Twist H81 (1:50, Santa Cruz 
Biotechnology), β-catenin (1:100, BD), pY654 β-catenin (1:50, abcam), pY1062 
Ret (1:50, Santa Cruz Biotechnology), Zeb1 (1:50, Santa Cruz Biotechnology), 
vimentin (1:300, Sigma), Ki67 (1:200, abcam), pS9 GSK-3 (1:50, Cell Signalling), 
pY568/570 Kit (1:100, Santa Cruz Biotechnology), pY412 Abl (1:50, Santa Cruz 
Biotechnology), pY209/211 Hck (1:50, Sigma), pY537 Yes (1:100, Santa Cruz 
Biotechnology), pY1238/1239 Ron (1:50, Santa Cruz Biotechnology), pSer473 
Akt (1:50, Santa Cruz Biotechnology), E-cadherin (1:100, Santa Cruz 
Biotechnology), villin (1:100, Santa Cruz Biotechnology), GDNF (1:50, abcam), 
neurturin (1:200, abcam), artemin (1:200, abcam), anti-mouse Cy3 (1:500, Jackson 
ImmunoResearch), anti-rabbit Alexa 568 (1:600, Molecular Probes), anti-rabbit 
Alexa 350 (1:200, Molecular Probes) and anti-rabbit Alexa 488 (1:200, Molecular 
Probes). All images were taken with a Zeiss LSM710 microscope at the Platform 
for Cell and Tissue Imaging of the Institut Curie. 

Nuclear β-catenin translocation was evidenced using the co-localization 
highlighter function of the co-localization analysis ImageJ plugin 
(http://www.uhnresearch.ca/facilities/wcif/imagej/colour_analysis.htm). Co- 
localization was coded in a binary way as white pixels (positive for co-localization) 
and quantified by measuring the density of white pixels, with control parameters 
adjusted at the threshold showing no co-localization in controls. 
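
The binary co-localization measure described above can be approximated by a simple sketch: a pixel counts as white when both channels exceed their thresholds, and the density is the fraction of white pixels. This is a simplified stand-in and not the ImageJ plugin's algorithm; the function name, the nested-list image representation and the threshold arguments are assumptions.

```python
def colocalization_density(channel_a, channel_b, threshold_a, threshold_b):
    """Fraction of pixels at or above threshold in both channels.

    channel_a and channel_b are equally sized 2D lists of pixel intensities
    (for example a beta-catenin channel and a nuclear stain).
    """
    total = 0
    white = 0
    for row_a, row_b in zip(channel_a, channel_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if a >= threshold_a and b >= threshold_b:
                white += 1  # pixel coded as positive for co-localization
    return white / total if total else 0.0
```

Following the text, the thresholds would be raised on control images until no co-localization is reported there, and then applied unchanged to the test images.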

Statistics. All quantitative results were analysed using the non-parametric Mann- 
Whitney exact test and Student’s t-test, both two-tailed, with standard deviation 
error bars being comparable within each group of data. 
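
For small groups, the exact two-tailed Mann-Whitney test named above can be computed by enumerating every possible assignment of ranks. The sketch below is a pure-Python illustration and not the software used for the analyses (which the text does not name); enumeration is only practical for small sample sizes, and the handling of ties through midranks is an assumption of this example.

```python
from itertools import combinations


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U statistic of group_a, exact two-tailed P value)."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(group_a) + len(group_b)
    expected = n_a * (n + 1) / 2  # mean rank sum of group_a under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    total = extreme = 0
    for subset in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in subset) - expected) >= observed - 1e-12:
            extreme += 1
    u_statistic = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return u_statistic, extreme / total
```

For two fully separated groups of three values each, this gives U = 0 and P = 0.1, the smallest two-tailed P value attainable at that sample size.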

All experiments were reproduced at least twice in the laboratory. n is the 
number of measurements per experiment. 

No randomization was performed. No blinding was performed. No statistical 
method or power analysis was used to predetermine sample size. The coordination 
of the distinct disciplinary aspects of the work (acoustic, magnetism, magnetic and 
tamoxifen injections, surgery, labelling, qPCR, etc.) between the ten groups 
involved was complex, and did not materially allow us to organize such proce- 
dures. 

Extended Data Figure 1 | Mechanical characterisation of pressure and 
stiffness in colonic tumours in Notch/Apc mice. a, Crypt strain deformation 
analysis. Imaging of tumour initiation in wide-open colon samples. Time- 
dependent elliptical crypt shape change in Notch/Apc mice tissue explants 
analysed at 0, 2 weeks and 1 month after tamoxifen injection initiating mitotic 
Notch expression (Methods) (measured in n = 2 mice per time point). Notch 
expression is revealed by GFP. Crypt strain deformation ε = L1/((L1+L2)/2) 
(L1 being the smallest radius, and L2 the largest radius of the crypts elliptically 
deformed by the Notch dependent hyperproliferation tumour pressure 
induced), analysis using ImageJ. ε_0weeks = 0.93 ± 0.05 (measured in n = 20 
crypts, two mice) for control sample at 0 weeks, ε_2weeks = 0.874 ± 0.06 
(measured in n = 31 crypts, two mice, P = 0.029) at 2 weeks, and ε_1month = 
0.79 ± 0.11 (measured in n = 20 crypts, two mice, P < 0.0001), at 1 month in 
colon explants. Statistic test is the non-parametric exact Mann-Whitney test. 
Error bars are standard deviation. Scale bars, 5 µm. b, Stiffness (E) of distal 
colon explants embedded in agar-gelatin phantoms (Methods) measured by 
using shear wave elastography (see Extended Data Fig. 2e). Average values of E 
were obtained from at least n = 6 measurements (in two mice for 0 weeks, 1 
month and 2 months, three mice for 3 months). c, In vivo stiffness images of 
wild-type abdomen, and of Notch/Apc tumorous colon after 3 months of 
tumour development (E = 70.8 ± 2.1 kPa). Elasticity measurements were taken 
in vivo over anaesthetized mice (average of at least n = 6 measurements, in two 
mice in each condition). d, Acoustic imaging distance of the magnet to the 
colon. The distance between the colon, observed in black after injection of 
echographic water gel in the lumen (Methods), and the round 3 mm diameter 
magnet, visualized as a result of acoustic refraction, was measured to be d = 7 ± 
1 mm (measured in n = 4 mice). The white bar is 1 mm. 
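
The crypt strain deformation defined in panel a of Extended Data Figure 1 is a simple ratio, sketched here for clarity. The function name is an assumption; the radii would come from ellipse measurements such as those made in ImageJ.

```python
def crypt_strain(radius_1, radius_2):
    """epsilon = L1 / ((L1 + L2) / 2), with L1 the smallest and L2 the largest radius.

    A circular crypt gives 1.0; values fall below 1.0 as the crypt becomes elliptical.
    """
    smallest, largest = sorted((radius_1, radius_2))
    return smallest / ((smallest + largest) / 2.0)
```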

Extended Data Figure 2 | Magnetic loading, iron oxide quantification by 
electronic spin resonance and strain compression calculation of magnetized 
wild-type and Apc+/1638N colon distal samples. a, Magnetizing mouse colon. 
Subcutaneous insertion of the 3 mm diameter magnet (in red) on the back of 
the mice 7 mm in front of the distal colon, following UML injection in the 
lateral caudal vein of the mouse tail. Acoustic elastography probe in green. 

b, UML detection and localization in the colon. Rhodamine-labelled UML 
observed 1 week after injection in wild-type colon explants, in contrast to non- 
injected controls (measured in n = 10 images, in six mice for each condition). 
c, Iron oxide quantification by electronic spin resonance in wild-type mice, 
injected with UML or magnetic liposomes loaded with an equivalent of 120 
moles and 2.2 moles of Fe(III) per mole of lipids, respectively. Iron oxide 
concentration in the distal colon was measured by electronic spin resonance 
(5% precision) from lyophilized colon explants at 1 week (magnetic liposomes: 
79 ± 12 nmole g⁻¹ (measured in n = 6 mice); UML: 183 ± 48 nmole g⁻¹ 
(measured in n = 3 mice)), 2 weeks (271 ± 64 nmole g⁻¹ (measured in n = 3 
mice)) and 1 month (299 ± 157 nmole g⁻¹ (measured in n = 3 mice)) after 
administration. Control sample was not injected with UML and was used to set 
up background of iron oxide concentration. A Mann-Whitney—Wilcoxon test 
concluded no significant difference in iron oxide concentration during 1 month 
(P = 0.2) in the case of the mice injected with UML. This supports the 
maintaining of the number and local containment of the delivered magnetic 
particles into the distal colon over this period of time, in agreement with the 
very slow biotransformation of such nanomagnets recently revealed in vivo". 
The iron oxide concentration at 1 week in the distal colon of wild-type mice 
injected with magnetic liposomes was found around twice lower compared to 
samples injected with UML while the residual content tends to vanish from 2 
weeks (results not shown). d, Magnetized wild-type colon samples injected with 


magnetic liposomes show no difference in local strain compression compared
to non-injected samples or to samples injected with magnetic liposomes without
a magnet (measured in n = 2 for each condition). e, Schematic representation of
the ultrasound measurement setup and biomechanical imaging techniques. Left,
shear wave elastography. Right, strain imaging. f, Ultrasound imaging of
magnetically induced deformation of the mouse colon. Representative B-mode
acoustic image (right) of a magnetically loaded colon explant and the
corresponding displacement map (left) after the magnet was moved from 10 mm to
7 mm towards the colon (Methods). The mean displacements within the
colon and the rest of the phantom were 35 µm and 16.9 µm, respectively
(measured in n = 12 mice). g, Stiffness measurements of control (in 3 mice) and
magnetized colon explants (in 6 mice) remained on the order of E = 30–35 kPa,
n = 6 measurements per mouse. h, Apc+/1638N mice show poor deformation at 1
week (mean elastic modulus control E = 24 ± 2 kPa, measured in n = 3 mice;
UML + magnet E = 23 ± 2 kPa, measured in n = 4 mice) compared to wild
type at 1 week (Fig. 1b) with similar iron oxide loading concentration of 275 ±
80 nmole g⁻¹ (see i, n = 3) compared to the 183 ± 48 nmole g⁻¹ of wild type
(see c). Deformation decreases at 2 weeks compared to 1 week in both the
Apc+/1638N (mean elastic modulus control E = 21 ± 5 kPa, measured in n = 3
mice; UML + magnet E = 33 ± 2 kPa, measured in n = 2 mice) and wild type
(mean elastic modulus control E = 30 ± 8 kPa, measured in n = 3 mice; UML
+ magnet E = 35 ± 5 kPa, measured in n = 3 mice). i, The mean Fe(III)
concentration in magnetized Apc+/1638N colon, of 265 ± 80 nmole g⁻¹ at one
month, remains on the order of magnitude of the mean concentration in
wild-type colons, of 250 ± 50 nmole g⁻¹, meaning that the magnetically induced
stress remains on the same order of magnitude during 1 month in Apc+/1638N
compared to wild-type mice (measured in n = 3 mice for each condition).
Extended Data Figure 3 | Mechanical induction of β-catenin nuclear
translocation, of target gene expression and ACF formation, through
β-catenin Y654 phosphorylation in Apc+/1638N mouse colon. a, Expression
levels of Myc, Axin2 and Zeb1 observed by RT-qPCR (measured in n = 8 mice
for Myc, n = 9 mice for Axin2 and Zeb1). Non-significant P values between
control and UML-injected conditions, P < 0.01 between UML with magnet
and UML-injected and control conditions (non-parametric exact
Mann-Whitney test). Error bars are standard deviation. b, Nuclear β-catenin under
UML with magnet conditions, compared to controls. White spots and purple
represent a positive nuclear DAPI and β-catenin co-localization, with a
preference for the peripheral privileged sites of transcriptionally active
chromatin (measured in n = 2 mice for each condition, ten images analysed
with 4–6 crypts per image). Quantification of P values: P > 0.05 for control




versus UML comparisons. P < 0.01 for comparison between UML plus magnet
versus UML without magnet, by Student's t-test. c, Number of large crypts
(>1,500 µm²) in Apc+/1638N mice under UML plus magnet conditions after 1
and 2 months (all crypts mean surface 1,218.30 ± 10.64 and 1,233.8 ± 211.08
µm², 261 crypts, measured in n = 2 mice and 195 crypts, measured in n = 3
mice, respectively), compared to control without UML (mean surface 927.6 ±
124.72 µm², 545 crypts, measured in n = 3 mice), and in Notch/Apc mice after
1 month and 2 months (mean surfaces of 1,262.6 ± 144.35 and 1,519.5 ±
319.19 µm², 441 and 178 crypts, respectively, measured in n = 2 mice for each
condition), compared to non-tamoxifen-injected Apc+/1638N control (mean
surface 927.6 ± 124.72 µm², 545 crypts, measured in n = 3 mice).
P(magnet, 1 month) < 0.01, P(magnet, 2 months) < 0.05, P(Notch/Apc, 1 month)
< 0.01 and P(Notch/Apc, 2 months) < 0.001 (analytic Student's t-test).
Scale bars are 10 µm.
Extended Data Figure 4 | Mechanically induced phospho-Tyr654-β-catenin
by tumour growth pressure does not co-localize with E-cadherin but with
villin. a, Top, Ki67 expression under UML plus magnet conditions shows an
increased number of positive cells per crypt, in contrast to controls (without
UML, or without magnet with UML; measured in n = 2 mice, 33 crypts, for
each condition). Blue is DAPI. Bottom, P < 0.001 between any
individual conditions. Statistical test is the non-parametric exact
Mann-Whitney test for all preceding quantitative experiments. b, No co-localization
of pTyr654-β-catenin with sub-apical junctional E-cadherin in response to
tumour growth pressure and co-localization with villin, in Apc+/1638N mice
with magnetic pressure (measured in n = 2 mice per condition). c, No
co-localization of pTyr654-β-catenin with sub-apical junctional E-cadherin in
response to tumour growth pressure and co-localization with villin, in
wild-type mice with magnetic pressure (measured in n = 2 mice per condition).

d, Immunofluorescence analysis of pY1062-Ret kinase in response to ex vivo
mechanical strain. The phosphorylation of Ret Y1062 is observed in ex vivo
Apc+/1638N colon upon mechanical compression (Methods) (54.26 ± 16%
positive crypts out of 236, measured in n = 7 mice), initiating at 1 min of
compression, compared to control (0.6 ± 1.3% of positive crypts out of 99,
measured in n = 5 mice). Mechanical activation of pY1062-Ret is impaired by
the wide-range Src-family kinase inhibitors PP1 and PP2 (1.55 ± 0.25% positive
crypts out of 142 and 6.03 ± 3.5% positive crypts out of 192, respectively,
measured in n = 2 mice for each condition). Sunitinib, a Ret inhibitor (9.85 ±
3.8% positive crypts out of 143, measured in n = 4 mice), blocks mechanical
activation of pY1062-Ret, whereas SU6656, a Src-specific inhibitor (39.5 ± 8.5% of
positive crypts out of 95, measured in n = 3 mice), does not block mechanical
activation of pY1062-Ret. e, Ex vivo mechanical induction of pY654-β-catenin
after 20 min (measured in n = 2 mice) (upper panel), enrichment of
cytoplasmic and nuclear β-catenin after 20 min (measured in n = 3 mice)
(middle panel), and Myc expression after 4 h (measured in n = 6 mice) (lower
panel), impaired by Sunitinib inhibitor treatment (measured in n = 2 mice per
condition) in Apc+/1638N mice. β-catenin control: 13.97 ± 2.7% positive crypts
out of 121, measured in n = 2 mice; compressed: 69.82 ± 8.7% positive crypts
out of 119, measured in n = 2 mice; compressed + sunitinib: 24.13 ± 1.2%
positive crypts out of 119, measured in n = 2 mice. Myc control: 10.96 ± 0.21%
positive crypts out of 118, measured in n = 2 mice; compressed: 75.08 ± 2.8%
positive crypts out of 148, measured in n = 6 mice; compressed + sunitinib:
23.36 ± 1.1% positive crypts out of 60, measured in n = 2 mice. Ex vivo
mechanical compression of Apc+/1638N mouse tissues caused an increase of
β-catenin-positive nuclei (14.35 ± 5 a.u. (arbitrary units), measured in n = 2
mice) by a factor of 3.6 compared to the control (3.97 ± 1.9 a.u., measured in
n = 2 mice), impaired in the presence of Sunitinib (4.9 ± 1.9 a.u., measured in
n = 2 mice). Merged images of DAPI and β-catenin were obtained with ImageJ
co-localization analysis on ten images analysed with 4–6 crypts per image.
White spots represent a positive co-localization. f, No co-localization of
pTyr654-β-catenin with sub-apical junctional E-cadherin and co-localization
with villin in Notch/Apc mice (measured in n = 2 mice). Here, the original
colour for E-cadherin and villin was blue to avoid Notch-GFP green fluorescence,
and was changed to green for the colour coherence of the complete figure. g, No
co-localization of pTyr654-β-catenin with sub-apical junctional E-cadherin and
co-localization with villin, in Apc+/1638N ex vivo compressed mice. Some apical
E-cadherin is observed (measured in n = 2 mice).
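The fold enrichment quoted in panel e can be checked directly from the reported means. The short Python sketch below is illustrative only and is not part of the original analysis; the function name `fold_change` is hypothetical, and it simply divides a treated mean by the control mean (both in arbitrary units), ignoring the reported standard deviations.

```python
def fold_change(treated_mean: float, control_mean: float) -> float:
    """Return the ratio of a treated mean to a control mean."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return treated_mean / control_mean


# Mean nuclear beta-catenin signal reported in panel e (arbitrary units).
control = 3.97
compressed = 14.35
compressed_sunitinib = 4.9

print(round(fold_change(compressed, control), 1))            # 3.6
print(round(fold_change(compressed_sunitinib, control), 1))  # 1.2
```

With only n = 2 mice per condition, such ratios are best read as order-of-magnitude indicators.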
Extended Data Figure 5 | Induction of ACF and adenoma formation in
response to magnetic pressure mimicking tumour growth in vivo observed
by colonoscopy in Apc+/1638N mice. a, Apc+/1638N control: time 0, 0 aberrant
crypt; t = 5 days, 0 aberrant crypt; t = 15 days, 2 ± 1 ACF; t = 1 month, 4 ± 1
ACF; t = 1.5 months, 4 ± 1 ACF; t = 2 months, 5 ± 2 ACF; t = 2.5 months, 3 ±
2 ACF (measured in n = 3 mice). Apc+/1638N UML: time 0, 0 aberrant crypt; t =
5 days, 0 aberrant crypt; t = 15 days, 2 ± 1 ACF; t = 1 month, 4 ± 1 ACF; t =
1.5 months, 4 ± 3 ACF; t = 2 months, 4 ± 2 ACF; t = 2.5 months, 3 ± 1 ACF
(measured in n = 3 mice). Mean size of ACF remained constant and around
874.5 ± 414 µm² in both cases. Apc+/1638N UML + magnet: time 0, 0 aberrant
crypt; t = 5 days, 2 ± 1 ACF of size 1,146.9 ± 101 µm²; t = 15 days, 8 ± 2 ACF
of size 1,777.7 ± 775 µm² plus one small adenoma of 4,229.38 µm² size; t = 1
month, 8 ± 2 ACF of mean size 3,483.9 ± 665 µm² plus one small adenoma;
t = 1.5 months, 8 ± 2 ACF plus one bigger adenoma of 6,379.92 µm² size; t = 2
months, 8 ± 2 ACF plus one small adenoma of 3,512.54 µm² size plus one
bigger adenoma; t = 2.5 months, 8 ± 2 ACF plus one small adenoma plus one
bigger adenoma of 6,523.28 µm² size (measured in n = 3 mice). Time 0
corresponds to 4-month-old animals. b, Histological characterization of the
adenoma carcinoma induced by magnetic mechanical pressure mimicking
tumour growth in the Apc+/1638N mice using haematoxylin and eosin staining.
Apc+/1638N UML 2.5 months, no adenoma carcinoma is observed (measured in
n = 3 mice); Apc+/1638N UML magnet 2.5 months and zoom, the crypt fusion
and loss of apico-basal polarity in nucleus position show the carcinoma nature
of the two adenomas observed (measured in n = 3 mice). All mice were injected
at 4 months. c, LOH (loss of heterozygosity) evaluation in colonic tissues of
Apc+/1638N mice under UML mechanical pressure by genotyping at 15 days.
PCR analysis of DNA derived from Apc+/1638N mice with oligonucleotides
for the wild-type and mutant Apc1638N allele. Wild-type mice show no
Apc1638N allele (measured in n = 3 mice), whereas Apc+/1638N heterozygous
mice show a ratio Apc:Apc1638N of 1.1 ± 0.1 (measured in n = 3 mice).
Apc+/1638N mice with UML and magnet show a ratio Apc:Apc1638N of 1.1 ± 0.1
(measured in n = 2 mice).
Extended Data Figure 6 | Mechanical activation of Ret in both Apc+/1638N
and wild-type mice. a, Screening of the mechanical activation of the Src family
kinases known to phosphorylate β-catenin Tyr654 and to be mechanosensitive
in cell culture. Ret activation by Ret Tyr1062 phosphorylation under uni-axial
compression (see Extended Data Fig. 4d legend for statistics). No
phosphorylation of the Tyr568/570 site of activation of Kit under uni-axial
compression. Control samples showed 6.3 ± 1.4% positive crypts (258 total
crypts observed, measured in n = 4 mice) and compressed samples 6.3 ± 1.2%
positive crypts (399 total crypts observed, measured in n = 6 mice). No
phosphorylation of the Tyr1238/1239 site of activation of Ron under uni-axial
compression. All negative crypts both in compressed samples (306 total crypts
observed, measured in n = 4 mice) and controls (347 total crypts observed,
measured in n = 4 mice). No phosphorylation of the Tyr537 site of activation of
Yes. Control samples showed 20.8 ± 1.3% positive crypts (307 total crypts
observed, measured in n = 4 mice) and compressed samples showed 15.7 ±
1.9% positive crypts (312 total crypts observed, measured in n = 4 mice).
Positive crypts had only one positive stained cell. No significant
phosphorylation increase of the Tyr209/211 site of activation of Hck under
uni-axial compression (measured in n = 4 mice for each condition). 89.2 ± 6.6%
positive crypts in compressed samples (268 total crypts observed, measured in
n = 4 mice) compared to control 88.3 ± 2.6% positive crypts (258 total crypts
observed, measured in n = 4 mice). No phosphorylation of the Tyr412 site of
activation of Abl under uni-axial compression. All negative crypts both in
compressed samples (326 total crypts observed, measured in n = 4 mice) and
controls (332 total crypts observed, measured in n = 4 mice). b, Inhibition of
pTyr1062-Ret, pTyr654-β-catenin and Myc mechanical induction by the two
additional inhibitors of Ret, vandetanib and ponatinib. Top, Tyr1062
phosphorylation of Ret under uni-axial compression (54.16 ± 15% positive
crypts of 236, n = 7 mice), initiating at 1-min compression, compared to
control (0.6 ± 0.8% positive crypts of 99, measured in n = 5 mice) is inhibited
in the presence of vandetanib (7.91 ± 1.2% positive crypts of 128 total crypts
observed (measured in n = 2 mice)) and ponatinib (5.17 ± 1.3% positive crypts
out of 193 total crypts observed (measured in n = 2 mice)). Middle, initiation of
the β-catenin oncogenic pathway by phosphorylation of β-catenin on Tyr654


under uni-axial compression (69.82 ± 8.7% positive crypts of 119, measured in
n = 2 mice) compared to control (13.97 ± 2.7% positive crypts of 121,
measured in n = 2 mice) is inhibited in the presence of vandetanib (5.89 ± 4.2%
positive crypts of 102 total crypts observed, measured in n = 2 mice) and
ponatinib (9.13 ± 3.3% positive crypts of 120 total crypts observed, measured
in n = 2 mice). Bottom, mechanical induction of Myc under uni-axial
compression (50.5 ± 0.2% positive crypts of 146, measured in n = 2 mice)
compared to control (11 ± 0.2% positive crypts of 118, measured in n = 2
mice), is altered in the presence of vandetanib (36.7 ± 2.2% positive crypts of
108 total crypts observed, measured in n = 2 mice) and ponatinib (20.2 ± 3.5%
positive crypts of 182 total crypts observed, measured in n = 2 mice). The
concentrations used were 10 µM for vandetanib and 100 nM for ponatinib.
c, pY1062-Ret induction quantified by western blot. Control (measured in n =
3 mice), UML + magnet (measured in n = 3 mice) and UML alone (measured
in n = 3 mice) conditions. P > 0.05 between control and UML, P < 0.01
between UML and UML + magnet, Student's t-test. d, Immunofluorescence
analysis, in wild-type control colon and UML-injected colon in the presence of
the magnet, of pY1062-Ret (at 2 weeks; control: measured in n = 3 mice, 300
crypts minimum; UML + magnet: measured in n = 3 mice, 469 crypts), of
pY654-β-catenin apically (at 1 month; control: measured in n = 3 mice, 798
crypts; UML + magnet: measured in n = 2 mice, 495 crypts), of cytoplasmic and
nuclear enrichment of β-catenin (at 2 months; control: 14 images analysed with
4–6 crypts per image; UML + magnet: ten images analysed with 4–6 crypts per
image; measured in n = 2 mice for each condition), and of the expression of
Myc (at 2 months; control: measured in n = 3 mice, 245 crypts; UML +
magnet: measured in n = 4 mice, 453 crypts). White spots and purple represent
positive co-localization, with a preference for peripheral sites. e, Quantification
of d. P values: pRet (P < 0.001), pβ-catenin (P < 0.001), β-cat (P < 0.001) and
Myc (P < 0.001), by Student's t-test. f, Percentage of crypts exceeding 1,500
µm² of mean surface area 3 months after UML injection mimicking tumour
growth pressure (mean surface 1,311.29 ± 82 µm², measured in n = 2 mice, 285
crypts) compared to the control without UML (mean surface 914.3 ± 189.09
µm², measured in n = 3 mice, 432 crypts). P(magnet, 3 months) < 0.05 (analytic
Student's t-test). Scale bars are 10 µm.
Extended Data Figure 7 | Ret activation is rapidly and non-cell
autonomously induced by the mechanical strains developed by magnetic
pressure. a, Rapid mechanical activation of pTyr1062-Ret initiating 30 min
after UML injection in the presence of magnet in wild-type and in Apc+/1638N
mice. Wild-type control: all negative crypts. Wild type + UML + magnet 30
min: 20.2 ± 0.56% positive crypts (measured in n = 2 mice, 372 crypts), 1 h:
44.56 ± 22.81% positive crypts (measured in n = 2 mice, 377 crypts), 2 h: 61.4
± 24.12% positive crypts (measured in n = 2 mice, 458 crypts), 4 h: 67.5 ±
10.23% positive crypts (measured in n = 2 mice, 510 crypts). Apc+/1638N
control: 1.22 ± 0.48% positive crypts (measured in n = 4 mice, 556 crypts).
Apc+/1638N + UML + magnet 30 min: 20.18 ± 3.73% positive crypts
(measured in n = 2 mice, 417 crypts), 1 h: 28.77 ± 7.87% positive crypts
(measured in n = 2 mice, 382 crypts). b, Maintenance of mechanical activation
of Tyr1062-Ret 24 h after UML injection in the presence of magnet in

Apc+/1638N mice. Apc+/1638N control: 1.22 ± 0.5% pTyr1062-Ret-positive
crypts out of 556 total crypts (measured in n = 4 mice). Apc+/1638N +
UML (without magnet): 3.88 ± 0.6% of pTyr1062-Ret-positive crypts out of
568 total crypts (measured in n = 2 mice). Apc+/1638N + UML + magnet:
14.15 ± 2.5% of pTyr1062-Ret-positive crypts out of 499 total crypts
(measured in n = 2 mice). c, Mechanical activation of pTyr1062-Ret kinase
in non-UML-loaded local domains, after UML injection and magnet
implantation. A strong phosphorylation of Ret Tyr1062 (green) was
observed not only in the domains where UML were accumulated (not
shown) but also in UML-absent domains (negative signal for fluorescent
rhodamine, left) in Apc+/1638N colons (measured in n = 4 mice for each
condition). d, No mechanical activation of pTyr1062-Ret kinase after UML
injection in the absence of magnet in Apc-deficient colons. In some
domains where an accumulation of UML in the connective tissue could be
observed (left), no Ret phosphorylation was observed (right) (measured in n
= 3 mice for each condition). e, Expression of GDNF, artemin and
neurturin does not change in magnetized Apc+/1638N colon explants. We


observed no effect of magnetization of the tissue on the expression of any
ligand of Ret as compared to the control not injected with UML, to the
non-magnetized colon tissue injected with UML and to the wild-type
kidney positive control expressing the three ligands of Ret (minimum of
300 crypts analysed, measured in n = 2 mice for each condition).

f, Mechanical inactivation of GSK-3β through Ser9 phosphorylation and
mechanical activation of the upstream Akt through Ser473 phosphorylation
in magnetized Apc-deficient colon explants. Increased phosphorylation of
GSK-3β Ser9 was observed in the magnetized tissue at 2 weeks (33.8 ±
0.4% of 307 crypts), compared to the non-magnetized control (12.7 ±
5.15% of 560 crypts) and to the UML-injected colon sample without
magnet (7.24 ± 2% of 138 crypts) (measured in n = 2 mice for each
condition). Increased phosphorylation of Akt Ser473 was observed in the
magnetized tissue at 2 weeks (40.6 ± 4.5% of 378 crypts, measured in n =
3 mice), compared to the non-magnetized control (3.2 ± 1.1% of 596
crypts, measured in n = 3 mice) and to the UML-injected colon sample
without magnet (2.7 ± 0.2% of 225 crypts, measured in n = 2 mice).

g, Ret-dependent mechanical inactivation of GSK-3β and activation of the
upstream Akt after ex vivo global compression of Apc-deficient colon
explants. GSK-3β: control, 12.5 ± 5.2% positive crypts out of 320
(measured in n = 2 mice); compressed, 43.26 ± 11% positive crypts out of
784 (measured in n = 2 mice); compressed + sunitinib, 19.7 ± 12%
positive crypts out of 216 (measured in n = 2 mice); compressed +
vandetanib, 14.4 ± 0.1% positive crypts out of 114 (measured in n = 2
mice); compressed + ponatinib, 15.45 ± 5.6% positive crypts out of 230
(measured in n = 2 mice for each condition). Akt: control, 5.3 ± 4.3%
positive crypts out of 519; compressed, 46.6 ± 3.5% positive crypts out of
554; compressed + sunitinib, 14.26 ± 8.3% positive crypts out of 560;
compressed + vandetanib, 5.37 ± 0.5% positive crypts out of 571;
compressed + ponatinib, 7.52 ± 0.8% positive crypts out of 443 (measured
in n = 2 mice for each condition).
Extended Data Figure 8 | Notch/Apc colon shows an activation of the
Ret/β-catenin/Myc mechanotransductive signalling pathway one month after
tumour growth initiation. a, Top, Ret Y1062 phosphorylation (measured in n
= 4 mice, 317 crypts), apical β-catenin Y654 phosphorylation (measured in n
= 4 mice, 223 crypts), cytoplasmic and nuclear β-catenin enrichment
(measured in n = 2 mice, 16 images analysed with 4–6 crypts per image), and
Myc expression activation (at 2.5 months, measured in n = 2 mice, 194 crypts),
in Notch/Apc tumorous tissue compared to non-tamoxifen-injected
Notch/Apc mice controls (measured in n = 4 mice, 264 crypts; measured in n = 4
mice, 198 crypts; measured in n = 2 mice, 18 images analysed with 4–6 crypts
per image; measured in n = 2 mice, 315 crypts, respectively). GFP fluorescence
(green) reveals a nuclear overexpression of Notch in the tumorous crypts. Note
that at 1 month after tamoxifen injection, GFP expression is often diffuse and
found in the nuclei and cytoplasm. Immunofluorescence staining and ImageJ
co-localization analysis revealed an enrichment of nuclear β-catenin (white and
purple spots represent a positive co-localization between β-catenin (red) and
DAPI (blue)) by a factor of 5.8 in Notch/Apc colon samples 1 month after
tamoxifen injection (11.64 ± 2.5 a.u., measured in n = 2 mice), compared to the
control (1.99 ± 0.8 a.u., measured in n = 2 mice). Bottom, phosphorylation of


β-catenin Y654 in Notch-negative crypts in tamoxifen-injected crypts
(measured in n = 4 mice, 68 crypts, yellow arrows) compared to
non-tamoxifen-injected control conditions (measured in n = 4 mice, 198 crypts).
Cytoplasmic enrichment and nuclear translocation of β-catenin in
Notch-negative crypts (measured in n = 4 mice). Nuclear translocation of β-catenin is
assessed by co-localization with DAPI (in white) in Notch-negative crypts of
Notch/Apc tissues after 1 month of tamoxifen injection compared to
non-tamoxifen-injected control conditions. White spots and purple represent a
positive nuclear DAPI and β-catenin co-localization, with a preference for the
peripheral privileged sites of transcriptionally active chromatin.
Immunofluorescence staining and ImageJ co-localization analysis revealed an
enrichment of nuclear β-catenin (white and purple spots represent a positive
co-localization between β-catenin (red) and DAPI (blue)) by a factor of 15 in
Notch-negative crypts in colon samples 1 month after tamoxifen injection (33.6
± 8.7 a.u., measured in n = 2 mice), compared to the control (1.99 ± 0.8 a.u.,
measured in n = 2 mice). Ten images with 4–6 crypts per image were analysed
in each condition. Scale bars are 10 µm. b, Quantification of a. P < 0.001 in all
cases, Student's t-test.
Extended Data Figure 9 | Ret activation is non-cell autonomously induced
by the mechanical strains developed by tumour pressure in Notch/Apc mice.
a, Early mechanical activation of pTyr1062-Ret kinase in crypts totally
surrounded by a Notch-negative domain in tumorous Notch/Apc colon
explants. Notch/Apc mice were injected with tamoxifen for four consecutive
days (instead of 5 days, see Methods) to induce tumour growth initiation, and
colon explants were analysed on the fifth day by immunofluorescence. Ret
phosphorylation was activated in 12.7 ± 2.3% of Notch-negative crypts (22
pTyr1062-Ret-positive crypts of a total of 177 GFP-negative crypts) completely
surrounded by Notch-negative crypts (GFP-negative), measured in n = 4 mice.
b, No phosphorylation of Ret Tyr1062 in Notch-overexpressing cells in early
tumorous Notch/Apc colon explants. 34.9 ± 8.8% of GFP-positive crypts
(yellow arrows) showed no expression of pTyr1062-Ret (39 crypts out of a total
of 113 crypts, measured in n = 4 mice). c, Strain deformation of Notch/Apc 4
days after tamoxifen injection. ε4d = 0.89 ± 0.08 (measured in n = 169 crypts,
in three mice), ε0d = 0.93 ± 0.05 (measured in n = 20 crypts in two mice), P =
0.004 (Mann-Whitney exact test), leading to a mean tumour stress of S4d = 0.9
± 0.1 kPa (following Hooke's law S = E × ε with E(Notch/Apc) = 22.8 ± 4.8 kPa
(see Extended Data Fig. 1b)). Error bars are s.d. d, Expression of GDNF,
artemin and neurturin Ret ligands does not change in tumour-initiated
Notch/Apc colon. We observed no effect of tumour growth on the expression of any of
the ligands (red) as compared to the control Notch/Apc not injected with
tamoxifen, and to the wild-type kidney positive controls in which the three
ligands of Ret are expressed. GFP fluorescence reveals positive expression of
Notch and labels the tumorous crypts (minimum 300 crypts analysed,
measured in n = 2 mice for each condition). At 1 month after tamoxifen (tam.)
injection, GFP expression is often diffuse and found in the nuclei and
cytoplasm.
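The stress estimate in panel c follows from Hooke's law. As a minimal sketch, and assuming that the effective strain is the difference between the two reported deformation values (an assumption made here for illustration; the legend does not spell this out), the arithmetic can be reproduced in Python. The function name `hooke_stress_kpa` is hypothetical.

```python
def hooke_stress_kpa(elastic_modulus_kpa: float, strain: float) -> float:
    """Hooke's law, S = E x strain, with E in kPa and a dimensionless strain."""
    return elastic_modulus_kpa * strain


E_NOTCH_APC_KPA = 22.8  # elastic modulus quoted in the legend
DEFORMATION_0D = 0.93   # value reported for day 0
DEFORMATION_4D = 0.89   # value reported 4 days after tamoxifen injection

# Assumption: the strain entering Hooke's law is the change between the two values.
strain = DEFORMATION_0D - DEFORMATION_4D
stress = hooke_stress_kpa(E_NOTCH_APC_KPA, strain)
print(f"{stress:.1f} kPa")  # 0.9 kPa
```

Under this reading the result agrees with the quoted mean stress of about 0.9 kPa; the sketch does not propagate the uncertainties on E or on the deformation values.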
Extended Data Figure 10 | Summary of the results. a, Magnetically induced
tumour growth pressure. b, Notch-induced tumour growth pressure. c, Ret/β-catenin
promote tumorigenic target gene expression and ACF formation and
are mechanically induced by tumour growth pressure.
MYC regulates the core pre-mRNA splicing
machinery as an essential step in lymphomagenesis


Cheryl M. Koh, Marco Bezzi, Diana H. P. Low, Wei Xia Ang, Shun Xie Teo, Florence P. H. Gay, Muthafar Al-Haddawi,
Soo Yong Tan, Motomi Osato, Arianna Sabo, Bruno Amati, Keng Boon Wee & Ernesto Guccione


Deregulated expression of the MYC transcription factor occurs in
most human cancers and correlates with high proliferation, reprogrammed
cellular metabolism and poor prognosis. Overexpressed
MYC binds to virtually all active promoters within a cell, although
with different binding affinities, and modulates the expression
of distinct subsets of genes. However, the critical effectors of
MYC in tumorigenesis remain largely unknown. Here we show that
during lymphomagenesis in Eμ-myc transgenic mice, MYC
directly upregulates the transcription of the core small nuclear
ribonucleoprotein particle assembly genes, including Prmt5, an
arginine methyltransferase that methylates Sm proteins. This
coordinated regulatory effect is critical for the core biogenesis of
small nuclear ribonucleoprotein particles, effective pre-messenger-RNA
splicing, cell survival and proliferation. Our results demonstrate
that MYC maintains the splicing fidelity of exons with a
weak 5′ donor site. Additionally, we identify pre-messenger-RNAs
that are particularly sensitive to the perturbation of the MYC-PRMT5
axis, resulting in either intron retention (for example,
Dvl1) or exon skipping (for example, Atr, Ep400). Using antisense
oligonucleotides, we demonstrate the contribution of these splicing
defects to the anti-proliferative/apoptotic phenotype observed
in PRMT5-depleted Eμ-myc B cells. We conclude that, in addition
to its well-documented oncogenic functions in transcription and
translation, MYC also safeguards proper pre-messenger-RNA
splicing as an essential step in lymphomagenesis.

We recently provided an overview of the transcriptional networks
perturbed by MYC during lymphomagenesis: to this aim, we established
MYC binding and gene expression profiles in B cells from
non-transgenic control mice (C), as well as pre-tumoural B cells (P) and
lymphomas (tumour, T) in Eμ-myc animals. Among the gene sets
upregulated by MYC, we identified components of the spliceosome
(Fig. 1a). Recent genome-wide association studies have uncovered a
high rate of mutations in splicing regulators, pinpointing their
potential involvement as driver oncogenes; consequently, there is
growing interest in drugging the spliceosome machinery for anti-cancer
therapy. Among the significantly modulated genes involved in
splicing, MYC promoted transcription of the core small nuclear
ribonucleoprotein particle (snRNP) assembly genes, of which PRMT5 is
the key enzymatic component (Fig. 1b and Extended Data Fig. 1a):
these genes showed progressive transcriptional increases in the
pre-tumoural and tumour stages, as assessed by RNA sequencing (RNA-seq),
validated by quantitative PCR (Fig. 1b, c and Extended Data Fig.
1b, c) and verified at the protein level (Extended Data Fig. 1d). These
genes were also bound by MYC in the proximity of their transcription
start site at the pre-tumoural stage, mostly on canonical E-boxes
(CACGTG) (Fig. 1c and Extended Data Fig. 1b). The correlation
between MYC and the core snRNP assembly genes also held true in

publicly available data sets from patients with lymphoma and
leukaemia (Extended Data Fig. 1e), and was validated at the RNA level
in 29 primary samples (Extended Data Fig. 1f and Supplementary
Table 1) and by immunohistochemistry for PRMT5 on 40 samples
from large B-cell, marginal zone, follicular and mantle-cell lymphomas
(Extended Data Fig. 1g and Supplementary Table 2), confirming the
upregulation of PRMT5 in samples overexpressing MYC. Normal tonsils
and lymph nodes had lower PRMT5 protein expression (Extended
Data Fig. 1g). Importantly, high expression of PRMT5 was an
accurate predictor of poor prognosis/survival in cohorts of diffuse large
B-cell lymphomas (Fig. 1d and Extended Data Fig. 1h). On the basis of
these observations, we hypothesized that coordinate increases in the
concentrations of core splicing factors, in particular PRMT5, might be
critical for sustaining oncogenic growth in MYC-driven tumours. In
line with this concept, glioblastoma stem cells, which overexpress
MYC, are more sensitive to splicing inhibition than neural stem cells.
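
The canonical E-box mentioned above is the hexamer CACGTG, which is its own reverse complement, so scanning a single strand is sufficient. As a purely illustrative sketch (this is not the ChIP-seq analysis used in the study), the Python function below, given the hypothetical name `find_e_boxes`, lists the 0-based start positions of canonical E-boxes in a DNA sequence; the example sequence is made up.

```python
E_BOX = "CACGTG"  # canonical E-box; palindromic, so one strand is enough


def find_e_boxes(sequence: str) -> list[int]:
    """Return 0-based start positions of every canonical E-box in the sequence."""
    seq = sequence.upper()
    positions = []
    start = seq.find(E_BOX)
    while start != -1:
        positions.append(start)
        start = seq.find(E_BOX, start + 1)
    return positions


# Made-up promoter fragment, for illustration only.
print(find_e_boxes("ttCACGTGaaCACGTG"))  # [2, 10]
```

A real analysis would also consider non-canonical E-box variants and position relative to the transcription start site, which this sketch ignores.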

Unlike Prmt5−/− mice, which are embryonic lethal, Prmt5+/−
mice appear normal and have normal blood counts (data not shown).
To address whether PRMT5 is limiting in Myc-induced lymphomagenesis,
we thus followed disease development in Eμ-myc;Prmt5+/−
mice. Remarkably, lymphoma development was profoundly delayed in
the Prmt5+/− background (Fig. 1e) (P < 0.0001). The
Eμ-myc;Prmt5+/− mice also had a significantly reduced disease burden
compared with their age-matched Eμ-myc littermates, both phenotypically and
histologically (Extended Data Fig. 2a–c).

To dissect the mechanistic basis of PRMT5 dependency in MYC-driven
lymphomagenesis, we acutely deleted PRMT5 in pre-tumoural
bone-marrow-derived Eμ-myc;Prmt5F/FCreER B cells. Acute Prmt5
deletion was induced ex vivo by the activation of CreER, which significantly
reduced viability (Fig. 2a), increasing both apoptosis and G1
arrest (Fig. 2b). Notably, these phenotypic effects of acute Prmt5 deletion
were virtually absent in non-transgenic Prmt5F/FCreER bone marrow
pre-B cells (Fig. 2a, b). Consistent with our previous report,
PRMT5 depletion led to a reduction in methylated Sm proteins
(Y12; Extended Data Fig. 3a) and the aberrant splicing of Mdm4
(Extended Data Fig. 3b). We then used RNA-seq to profile the abundance
of alternative mRNA species in Eμ-myc;Prmt5F/FCreER bone
marrow pre-B cells in the presence (EtOH) or absence (OHT) of
PRMT5. We identified 3,245 differentially expressed genes
(Supplementary Table 3). Consistent with the observed phenotype,
the functional annotation highlighted defects in cell proliferation,
among other pathways (Extended Data Fig. 3c), and we observed that
the compiled number of reads in introns was elevated in the absence of
PRMT5 (Extended Data Fig. 3d). Interestingly, multivariate analysis of
transcript splicing (MATS) identified 153 alternative splicing events
(mainly retained introns and skipped exons (Extended Data Fig. 3e and
Supplementary Table 4) with weak 5′ donor sites) (Extended Data
Figure 1 | MYC directly upregulates the core snRNP assembly genes,
including PRMT5. a, Gene set enrichment analysis showing the top enriched
categories for Myc-bound and overexpressed transcripts in Eμ-myc
pre-tumoural B cells. b, Heat map showing the expression of RNA splicing and core
snRNP assembly genes in wild-type (n = 3), Eμ-myc pre-tumoural (n = 3) and
Eμ-myc tumour B cells (n = 4). c, ChIP-seq analysis of MYC binding (top three


Fig. 3f), affecting gene products involved in regulating cellular response 
to stress, signal transduction and chromatin organization (among 
others) (Fig. 2c). Remarkably, we were able to validate 22 out of 22 
of these alternative splicing events in Eμ-myc;Prmt5F/FCreER bone
marrow pre-B cells, while less significant changes were observed in
non-transgenic Prmt5F/FCreER bone marrow pre-B cells (Fig. 2d and
Extended Data Fig. 4a). In summary, upon PRMT5 depletion in B cells, 
we observed a strong correlation between MYC levels and the severity 
of both splicing defects and cell proliferation (Fig. 2a, b, d). On the basis 
of our RNA-sequencing results, we predicted that several of the alternatively
spliced mRNAs, which were generated in cells with reduced
PRMT5 levels, would be targeted for nonsense-mediated messenger
RNA decay and/or be out of frame (Supplementary Table 5).
Accordingly, we verified that the observed alternative splicing events 
(both skipped exons and retained introns) resulted in the reduction of 
the full-length proteins in seven cases tested (ATR, MDM4, DVL1, 
PRPF40, CEP110, PHKG2 and EP400) (Extended Data Fig. 4b). 
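
The out-of-frame prediction mentioned above rests on simple arithmetic: skipping an exon whose length is not a multiple of three shifts the downstream reading frame. The sketch below is only an illustration of that rule, not the authors' pipeline. The function names and exon lengths are hypothetical, and it assumes the skipped exon lies entirely within the coding sequence.

```python
def skipped_exon_shifts_frame(exon_length: int) -> bool:
    """Return True when removing an exon of this length changes the reading frame.

    Assumption: the exon is fully coding, so only its length modulo 3 matters.
    """
    if exon_length <= 0:
        raise ValueError("exon length must be a positive number of nucleotides")
    return exon_length % 3 != 0


def classify_skipping(exon_lengths: dict) -> dict:
    """Label each (hypothetical) skipped exon as a frameshift or an in-frame deletion."""
    return {
        name: "frameshift" if skipped_exon_shifts_frame(length) else "in-frame deletion"
        for name, length in exon_lengths.items()
    }


# Hypothetical exon lengths in nucleotides, chosen only to show both outcomes.
example = classify_skipping({"exon_A": 100, "exon_B": 99})
```

A frameshift usually introduces a premature stop codon downstream, which is what makes such transcripts candidates for nonsense-mediated decay. An in-frame deletion can still remove an essential protein domain.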

The critical role of Prmt5 in lymphoma development suggested that 
it may also be essential for tumour maintenance. To address this 
question, we transplanted severe combined immunodeficiency 
(SCID) recipient mice with tumour cells derived from
Eμ-myc;Prmt5F/FCreER mice, and subsequently deleted Prmt5 in vivo
(Fig. 3a). Remarkably, deletion of one copy of PRMT5 slowed disease 
lanes) to the promoter region of Prmt5 and RNA-seq (bottom lanes) showing
the expression of Prmt5 in control (C), Eμ-myc pre-tumoural (P) and Eμ-myc
tumour B cells (T). d, Survival of patients with lymphoma stratified by PRMT5
expression. e, Kaplan-Meier analysis of tumour-free survival of Eμ-myc
(n = 61) and Eμ-myc;Prmt5+/− mice (n = 58).


onset, while homozygous deletion resulted in full disease-free survival 
(Fig. 3a and Extended Data Fig. 5a, b). Acute Prmt5 deletion ex vivo, in 
the same Eμ-myc;Prmt5F/FCreER lymphoma cells, reduced viability,
increased both apoptosis and G1 arrest (Fig. 3b) and induced alterna- 
tive splicing events similar to those observed in pre-tumoural B cells 
(Fig. 3c and Extended Data Fig. 5c). Both phenotypic and splicing 
defects were observed in all samples tested, irrespective of a functional 
ARF-MDM2-p53 pathway (Extended Data Fig. 5d). The partial 
reduction of PRMT5 in Eμ-myc;Prmt5+/FCreER lymphoma cells only
resulted in a subset of the observed alternative splicing events 
(Extended Data Fig. 5e), possibly accounting for the intermediate 
phenotype observed (Fig. 3a). 

PRMT5 depletion in the human Burkitt lymphoma cell lines Raji 
and Daudi reduced viability (Extended Data Fig. 6a), increased the 
number of apoptotic cells (Extended Data Fig. 6b, c) and reduced their 
tumorigenic potential in xenografted mice (Extended Data Fig. 6d). 
The depletion of PRMT5 also reduced the levels of methylated Sm 
proteins (Y12) (Extended Data Fig. 6e) and resulted in the aberrant 
splicing of pre-mRNAs with weak 5’-donor sites, causing both exon 
skipping and intron retention (Extended Data Fig. 6f), as previously 
observed in murine cells (Fig. 3c and Extended Data Fig. 5c). In accord- 
ance with these data, the depletion of another core component of the 
splicing machinery (SmB) resulted in the perturbation of similar targets, 
Figure 2 | PRMT5 is essential for maintaining splicing fidelity. a, Viability
of Prmt5F/FCreER and Eμ-myc;Prmt5F/FCreER pre-B cells after Prmt5 deletion
(n = 4 for each genotype). b, Apoptosis and the cell-cycle profile of the pre-B
cells after Prmt5 deletion, as assessed by flow cytometry (n = 5). c, Gene
ontology of alternatively spliced transcripts detected upon PRMT5 depletion in
Eμ-myc;Prmt5F/FCreER bone marrow B cells. d, Validation of alternatively
spliced transcripts in Eμ-myc;Prmt5F/FCreER bone marrow B cells after Prmt5
depletion (OHT) by semi-quantitative PCR. Importantly, these changes were
considerably less pronounced after PRMT5 deletion in B cells from
Prmt5F/FCreER mice. Quantification of three independent biological replicates
is shown on top, while a representative example is shown in the bottom panel.
Data are the average and s.d. Student’s t-test (two-sided) was used; *P < 0.05
and **P < 0.01.


and in reduced cell viability (Extended Data Fig. 6g). These data confirm 
that PRMT5 is indeed a key regulator of the core splicing machinery, 
and is essential for maintaining splicing fidelity in lymphoma. 
Importantly, in both mouse (Eμ-myc B cells) and human (Raji/
Daudi) lymphoma cells, the downregulation of MYC led to reduced
PRMT5, SmD1 and SmD3 mRNA and protein levels (Fig. 3d and
Extended Data Fig. 7a, c), and resulted in aberrant splicing
events (Fig. 3e and Extended Data Fig. 7b, d), similar to those observed 
upon PRMT5 or SmB depletion (Fig. 3c and Extended Data Figs 5c 
and 6f, g). 

Our data show that the effect of PRMT5 deletion is more pro- 
nounced in proliferating cells expressing high MYC levels (Fig. 2a, b, 
d and Extended Data Fig. 4a), suggesting that systemic Prmt5 
deletion in vivo may cause adverse side effects. Indeed, the constitutive 
deletion of Prmt5 in adult mice and in embryos during mid-gestation 
resulted in death’. Prmt5 deletion in utero at embryonic day (E)10.5 
resulted in a reduction in fetal liver cellularity (Extended Data Fig. 8a), 
with a reduction of both kit+lin−;CD34− and kit+lin−;CD34+ cells
(Extended Data Fig. 8b). The remaining cells were severely impacted in 
their ability to form haematopoietic colonies in vitro (Extended Data 
Fig. 8c), and their expression profile indicated a strong upregulation of 
apoptotic pathways, among others (Extended Data Fig. 8d). We 
observed similar reductions in cellularity and haematopoietic colony 
formation in adult Prmt5F/FER bone marrow cells after Prmt5 deletion
in vivo (Extended Data Fig. 8e, f), which was accompanied by a marked 
reduction in mature lymphocytes, neutrophils and erythrocytes in the 
peripheral blood (Extended Data Fig. 8g). Additionally, the deletion of 
Prmt5 in lethally irradiated mice reconstituted with Prmt5F/FER bone
marrow haematopoietic progenitors’’ resulted in a significant reduc- 
tion in bone marrow cellularity, which subsequently led to the death of 
the animals (Extended Data Fig. 8h). 

In summary, the complete elimination of PRMT5 had severe con- 
sequences on normal physiology; yet, the overexpression of PRMT5 
observed in various haematopoietic malignancies (Fig. 1a–c and
Extended Data Fig. 1), the delayed progression of lymphomas after
heterozygous deletion of the gene (Fig. 1e), the sensitization by Myc to
loss of PRMT5 (Fig. 2a, b) and the dependency on PRMT5 for tumour 
maintenance (Fig. 3a, b) suggest that a therapeutic window for PRMT5 
inhibition may exist. 

To further explore the functional consequences of aberrant splicing
resulting from PRMT5 depletion, we decided to focus on Atr and
Ep400 (skipped exons), given their role in preventing MYC-induced
replicative stress and in cooperating with MYC transcriptional activity
and MYC-driven cell-cycle progression, respectively; and on
Dvl1 (retained intron), given the well-documented interplay between
MYC and the Wnt pathway in promoting tumorigenesis. We
designed steric hindrance antisense oligonucleotides (ASOs) to
induce specific Atr exon 20 skipping, Ep400 exon 48 skipping and
Dvl1 intron 3 retention (Fig. 4a, Supplementary Table 6 and
Extended Data Fig. 9a). The electroporation of the ASOs into
Eμ-myc B cells resulted in the specific induction of the alternative splicing
event, leading to the reduction of Atr, Ep400 and Dvl1 full-length
mRNAs and protein levels (Fig. 4b and Extended Data Fig. 9b), reduction
of cell viability (Fig. 4c) and the induction of apoptosis (Fig. 4d).
We have thus identified an essential regulatory mechanism that, 
through the transcriptional control of PRMT5 and other snRNP com- 
ponents, allows MYC to regulate the maintenance of a fully functional 
splicing machinery. This enables proper post-transcriptional proces- 
sing and consequent expression of full-length proteins (for example, 
ATR, EP400 and DVL1, among others), sustaining cancer cell survival 
and proliferation. The reversal of a single splicing event (for example, 
intron retention of Dvl1), after PRMT5 depletion, only modestly rescued
cell viability (Extended Data Fig. 9c). This is not surprising, given
that multiple splicing events are affected upon perturbation of the core 
splicing machinery. 

After our genome-wide mapping of MYC binding and gene regu- 
lation*, we show here that MYC is a master regulator of the core snRNP 
machinery, ensuring proper mRNA splicing. Other links had prev- 
iously been made between MYC and splicing: first, PTPB, hnRNPA1 
and hnRNPA2 levels correlate with those of MYC, regulating express- 
ion of the PKM2 isoform of pyruvate kinase”; second, MYC drives the 
transcription of the oncogenic splicing regulator, SRSF1 (refs 24, 25). 
Figure 4 | Antisense oligonucleotides targeting ATR-, EP400- and DVL1-
alternative splicing mimic the cell-cycle arrest/apoptotic response induced
by MYC/PRMT5 depletion. a, Schematic representation of ASOs designed to
induce exon skipping (ATR and EP400) and intron retention (DVL1).
b, Validation of efficacies of ASOs in inducing alternatively spliced transcripts.
Representative gel images are shown (n = 4). c, Cell viability of Eμ-myc B cells
after electroporation with the respective ASOs (n = 4). d, Cell-cycle profiles of
Eμ-myc B cells after electroporation with the respective ASOs (n = 4). The P
values indicate the significance of the difference in percentage apoptosis,
compared with controls. Data are the average and s.d. Student’s t-test
(two-sided) was used; *P < 0.05, **P < 0.01.
Our report, however, provides the first evidence that MYC regulates the
constitutive splicing machinery, including the Sm proteins and PRMT5 
(refs 6, 7). Taken together, our results support a model (Extended Data 
Fig. 10) whereby MYC overexpression forces the cell to rely on high 
levels of PRMT5 and mature snRNPs to sustain splicing fidelity, most 
probably because of the documented direct and indirect RNA amp- 
lification levels in cancer cells** (see Supplementary Information for 
additional discussion). These data point at the possibility of drugging 
the spliceosome, either by directly affecting its core components” or by 
targeting PRMT5 methyltransferase activity, as a potentially viable 
strategy in MYC-driven tumours, although potential side effects 
(Extended Data Fig. 8) will have to be carefully taken into account. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. 

Mouse strains and genotyping. The floxed PRMT5 mice were described previously.
4-Hydroxytamoxifen (4-OHT)-inducible conditional knockouts were created
by crossing Prmt5F/F mice with Rosa26-CreERT2 transgenic mice (in mixed
C57BL/6 × 129S1/SvImJ background). The Eμ-myc mice were purchased from the
Jackson Laboratory and SCID mice were obtained from BRC, A*STAR.

Animal studies. The experimental protocol was approved by the Institutional 
Animal Care and Use Committee (IACUC), and the animals were maintained 
under compliance with institutional guidelines. The sample size was not pre- 
selected and no inclusion/exclusion criteria were used. All mice were monitored 
daily for signs of morbidity and lymphoma development. Whole blood was col- 
lected from the tail vein of mice using heparinized microhaematocrit capillary 
tubes (Fisher 22-362-566) and analysed using an NK MEK-6318 haematology 
analyser (Nihon Koden Corporation). For tissue collection, the mice were killed 
according to IACUC guidelines, and the spleens, tumours and bone marrow were 
collected for subsequent analysis. 

Isolation of haematopoietic cell populations. Haematopoietic cells were 
extracted from fetal liver. Flow cytometric analyses by FACSAria (BD 
Biosciences) were performed using a standard method. Briefly, blocking non- 
specific binding in mouse serum was followed by antibody staining on ice for 
30 min. After washing, cells were suspended in 0.5 ml of phosphate-buffered saline 
with propidium iodide or Hoechst 33258 for dead-cell discrimination. On the 
screen of flow cytometry, homogenous populations in cell size (forward scatter- 
side scatter window) and viable cells were gated and analysed for individual 
antigen expressions. Lineage markers included Gr1, Ter119, CD3, CD4, CD8
and B220. Monoclonal antibodies were purchased from BD Biosciences or
eBioscience: PE-conjugated anti-Gr1 (RB6-8C5), Ter119 (TER-119), CD3
(145-2C11), CD4 (RM4-5), CD8 (53-6.7), B220 (RA3-6B2), PE-Cy7-conjugated
anti-c-Kit (2B8) and FITC-conjugated CD34 (RAM34).

For colony formation assays, haematopoietic cells were extracted from E14.5 
fetal liver or adult bone marrow. Methocult M3630 and M3434 (Stem Cell 
Technologies) assays were set up according to the manufacturer’s protocol. 
Colonies were enumerated 10–14 days after plating under a bright-field microscope.

Immunohistochemistry staining. Automated immunohistochemistry (IHC)
staining and counterstaining were performed on a Leica Bond-Max autostainer.

Cell culture. Primary Eμ-myc lymphoma and bone marrow cells were isolated
and maintained, as previously described. Raji and Daudi cells were maintained
in RPMI1640 and 293T cells were maintained in DMEM. All media were supplemented
with 10% FBS and 1% penicillin-streptomycin. All cells were grown in a
humidified incubator at 37 °C and 5% CO2. The shRNA sequences for lentiviral-based
knockdown were as follows: shPRMT5-1, CCGGGGCTCAAGCCACCAA
TCTATGCTCGAGCATAGATTGGTGGCTTGAGCCTTTTTG; shPRMT5-2,
CCGGCCCATCCTCTTCCCTATTAAGCTCGAGCTTAATAGGGAAGAGGA 
TGGGTTTTTG; shSMB-1, CCGGCCACAAGGAAGAGGTACTGTTCTCGAG 
AACAGTACCTCTTCCTTGTGGTTTTTG; shSMB-2, CCGGCACATGAATTT 
GATCCTCTGTCTCGAGACAGAGGATCAAATTCATGTGTTTTTG; shMYC 
(mouse), GGAGATGATGACCGAGTTA; shMYC-1 (human), GATGAGGAAG 
AAATCGATG; shMYC-2 (human), GATGAGGAAGAAATCGATA. 
Antibodies. The following antibodies were used: actin (Santa Cruz, SC-47778), 
PRMT5 (Santa Cruz, SC-22132), c-MYC (Santa Cruz, SC-40), SNRPD1 (Abcam, 
ab50940), SNRPD3 (Abcam, ab111094), and Smith antigen (Abcam, ab3138). 
ChIP sequencing and RNA sequencing library preparation. ChIP-seq and 
RNA-seq were performed according to standard methods (Illumina), or as prev- 
iously described®. 

RNA sequencing and analysis. Paired-end sequencing (150 base pairs) was performed
on an Illumina NextSeq 500. Reads were mapped to the mm9 mouse
genome assembly using TopHat (http://tophat.cbcb.umd.edu/) (version 2.0.9) with
the following parameters: [-p 8 --splice-mismatches 1 --segment-mismatches 2].
Aligned reads were then quantified for expression using the Cufflinks suite version
2.1.1 (http://cufflinks.cbcb.umd.edu/) and edgeR
(http://www.bioconductor.org/packages/release/bioc/html/edgeR.html).
Downstream manipulation of RNA-seq results was done with CummeRbund
(version 2.0) and with in-house scripts.
Genes were considered to be significantly differentially expressed at P < 0.05 and
abs(log[FPKM ratio]) > 1. Gene set testing used the mroast function from the R
package limma. To determine differential splicing events, MATS 3.0.8 beta
(http://rnaseq-mats.sourceforge.net/) was used to count junction reads and reads falling
into the tested region within ENSEMBL gene definitions. Splicing events were
labelled significant if the sum of the reads supporting a specific event exceeded ten
reads, and P < 0.05.
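
The two significance rules above can be written as simple filters. This is a minimal sketch under stated assumptions, not the in-house scripts used in the study. The class and function names are hypothetical, and the log base (2) and the pseudocount that avoids taking the logarithm of zero are assumptions that the text does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    fpkm_control: float  # e.g. EtOH-treated cells
    fpkm_treated: float  # e.g. OHT-treated cells
    p_value: float


@dataclass
class SplicingEvent:
    event_id: str
    supporting_reads: int  # sum of reads supporting the event
    p_value: float


def is_differentially_expressed(result, p_cutoff=0.05, log_ratio_cutoff=1.0,
                                pseudocount=1.0):
    """P < 0.05 and absolute log FPKM ratio > 1 (log2 and pseudocount are assumed)."""
    ratio = (result.fpkm_treated + pseudocount) / (result.fpkm_control + pseudocount)
    return result.p_value < p_cutoff and abs(math.log2(ratio)) > log_ratio_cutoff


def is_significant_event(event, min_reads=10, p_cutoff=0.05):
    """Supporting reads must exceed ten, and P must be below 0.05."""
    return event.supporting_reads > min_reads and event.p_value < p_cutoff


# Hypothetical values, chosen only to exercise both branches of each filter.
up_gene = GeneResult("geneA", fpkm_control=1.0, fpkm_treated=7.0, p_value=0.01)
flat_gene = GeneResult("geneB", fpkm_control=1.0, fpkm_treated=2.0, p_value=0.01)
```

Note that an event with exactly ten supporting reads fails the filter, because the text requires the read sum to exceed ten.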

ChIP-sequencing. ChIP-seq data sets analysed were obtained from ref. 4.
Alignments were generated from Illumina Fastq files to the mm9 genome using
Bowtie 0.12.8 (http://bowtie-bio.sourceforge.net/index.shtml). Peak calling was
done using MACS (https://github.com/taoliu/MACS/) (version 2.0.9) with
the following parameters: [--bw 300 -m 5,30 --shiftsize 100 -p 1e-3 --slocal=1000
-B --keep-dup=1]. Peak distributions were defined using RefSeq annotation.
Distances within −5 kilobases (kb)/+2 kb of a gene transcription start site were
defined to be the promoter. To identify overlapping peaks between data sets, a
region was first defined as a ±1 kb window from an enriched peak centre. This
region was then compared against the list of similarly defined regions from a
corresponding data set.
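
The promoter definition and the peak-overlap rule above can be sketched as two small functions. This is an illustration only: the function names are hypothetical, positions are assumed to lie on the same chromosome, "upstream" is taken relative to the gene's strand, and the window is read as extending 1 kb on each side of the peak centre.

```python
def in_promoter(peak_centre: int, tss: int, strand: str,
                upstream: int = 5000, downstream: int = 2000) -> bool:
    """True when the peak centre lies within -5 kb/+2 kb of the TSS.

    The offset is measured in the direction of transcription, so a negative
    offset means the peak is upstream of the transcription start site.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = peak_centre - tss if strand == "+" else tss - peak_centre
    return -upstream <= offset <= downstream


def windows_overlap(centre_a: int, centre_b: int, half_width: int = 1000) -> bool:
    """True when the windows of +/- half_width around two peak centres intersect."""
    return abs(centre_a - centre_b) <= 2 * half_width
```

For example, a peak centred 3 kb past the TSS coordinate falls outside the promoter of a plus-strand gene but inside the promoter of a minus-strand gene at the same position, because for the latter it lies upstream.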

Other bioinformatics analysis. All GO term analysis used DAVID
(http://david.abcc.ncifcrf.gov). Density plots of genomic loci were generated using
IGB (http://bioviz.org/igb/). GO reduction and connectivity graphs were done
with REVIGO (http://revigo.irb.hr/) and Cytoscape version 3.0.2
(www.cytoscape.org/). Public Gene Expression Omnibus (GEO) data sets were retrieved using
the R package ‘GEOquery’. Survival analysis was done in R using the ‘survival’ 
package. 
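
For readers unfamiliar with the survival curves shown in the figures, the Kaplan-Meier estimator behind them can be sketched in a few lines. The study used the R 'survival' package. The pure-Python function below is a stand-in for illustration only, with a hypothetical name and made-up follow-up times.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each subject
    events : True if the event (e.g. tumour onset) was observed at that time,
             False if the subject was censored at that time
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects that share this time point.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Made-up data: four subjects, one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Censored subjects lower the number at risk without producing a step in the curve, which is why the step at time 3 in this example is computed from two subjects rather than three.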

Accession numbers. Microarray data from public data sets were obtained from 
GEO with the accession numbers GSE4475, GSE22470 and GSE10846. RNA- 
sequencing data from ref. 4 were obtained from GSE51011. RNA-sequencing 
and microarray data sets generated in this study have been deposited in Gene 
Expression Omnibus database in the Superseries GSE61638. 

Tamoxifen injections in vivo and 4-OHT treatment in vitro. For in vivo experi- 
ments, 2 mg tamoxifen (Sigma, T5648) was administered intraperitoneally in adult 
mice for three consecutive days. For in vitro experiments, cells were treated with 
50 nM 4-OHT or an equal volume of EtOH as a control for 24 h, after which the
cells were washed and resuspended in fresh media. In all experiments with
Prmt5F/FCreER-derived cells or animals, we used ROSA26:CreER counterparts
as negative controls, ensuring that the addition of OHT or tamoxifen was not
toxic. The doses of tamoxifen and 4-OHT were carefully titrated to the minimum
necessary for effective recombination, with minimal side effects. 

Cell viability assays. Cell viability was assessed by a CellTiter-Glo Luminescent 
Cell Viability Assay (Promega), according to the manufacturer’s protocol. 

Flow cytometry analysis. The spleens and lymphomas were collected and single- 
cell suspensions were isolated in PBS with 10% FBS. The cells were stained with 
500 µg ml−1 propidium iodide in 1% w/v Triton X-100 in PBS with 2 mg RNase A
(Sigma) at room temperature (23 °C) for 30 min, and subsequently analysed by 
flow cytometry analysis. Dead cells and debris were excluded by gating on the basis 
of their forward and side scatter characteristics. 

Lymphoma transplant studies. These were done as previously described*'. 
Briefly, 6- to 8-week old SCID recipient mice were transplanted with 10° primary 
lymphoma cells by tail vein injection and monitored daily for signs of disease 
manifestation. Recipient mice from different litters were randomized equally for 
group allocation. No blinding was done. 

Statistical analysis. Student’s t-test (two-sided) was applied, and changes were
considered statistically significant when P < 0.05. In the figures, significant
changes are marked *P < 0.05 and **P < 0.01. The data were normally distributed and variation
within and between groups was not estimated. The sample size was not pre- 
selected and no inclusion/exclusion criteria were used. The data shown are the 
averages and s.d. of at least three biological replicates (that is, cells isolated from at 
least three independent mice). Statistical analysis used Microsoft Excel or 
GraphPad Prism software. 
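
The summary statistics and significance marks described above can be reproduced with the Python standard library. This is a hedged sketch, not the Excel or Prism workflow used in the study. The function names are hypothetical, and the test statistic assumes the classic equal-variance form of Student's t-test. Converting the statistic to a two-sided P value needs the t distribution, which is left to a statistics package.

```python
import math
from statistics import mean, stdev


def summarize(values):
    """Return (average, sample s.d.) as reported in the figures."""
    return mean(values), stdev(values)


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def significance_mark(p_value):
    """Map a P value to the figure annotation: ** for P < 0.01, * for P < 0.05."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Made-up triplicates, standing in for three biological replicates per group.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```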

ASOs. Novel ASOs were applied to bind to nascent transcripts of a target gene via 
Watson-Crick bonding to exert steric hindrance effect against splicing factors to 
modulate splicing. The ASOs were rationally designed for optimal efficiency in 
inducing splicing modulation, as previously described**”. Briefly, ASO target sites 
were selected by a computational algorithm that accounted for co-transcriptional 
binding accessibilities, binding thermodynamics and presence of regulatory splic- 
ing motifs. All the designed ASOs were synthesized by Sigma Aldrich (Singapore) as 
single-stranded 2'-O-methyl modified RNA bases linked with phosphorothioate 
backbone. 

Each ASO inducing specific exon-skipping events in ATR and EP400 genes was 
designed to bind exonic sequences to mask respective splicing motifs that were 
required for the proper splicing of the target exon. On the other hand, ASOs 
inducing a specific intron retention event in the DVL1 gene were designed to bind 
the target intron splice sites either separately or simultaneously, as depicted in 
Fig. 4a, to mask the boundaries of its flanking exons. In the latter, either two 
distinct ASOs or a dual-targeting ASO (that is, an ASO targeting two non-con- 
secutive target sites’*) were used. 

The sequence of each ASO and the experimental design for validating DVL1 
targeting ASOs is tabulated in Supplementary Table 6. 

ASO transfection protocol. ASOs were electroporated into Eμ-myc B cells at a
final concentration of 100 nM using the Neon Transfection System (Invitrogen). 
The electroporation parameters were optimized individually for B cells isolated 
from separate mice, and were typically 1750 V × 20s × one pulse, 1350 V ×
30s × one pulse or 1200 V × 20s × two pulses.
Extended Data Figure 1 | MYC directly upregulates the core snRNP
assembly genes. a, RNA-seq expression of all RefSeq genes, core snRNP genes,
spliceosome genes and RNA splicing genes in wild-type, Eμ-myc pre-tumoural
B cells and Eμ-myc tumour B cells. b, ChIP-seq analysis of MYC binding (top
lanes) and RNA-seq (bottom lanes) showing the expression of ‘core snRNPs
assembly’ genes in control (C), Eμ-myc pre-tumoural (P) and Eμ-myc tumours
(T). Student’s t-test (two-tailed) was used; *P < 0.05, **P < 0.01.
c, Quantitative PCR validation of the expression of representative ‘core snRNPs
assembly’ genes in wild-type B cells (n = 7), Eμ-myc pre-tumoural B cells
(n = 4) and Eμ-myc tumour cells (n = 16). d, Immunoblots of representative
‘core snRNPs assembly’ proteins in wild-type B cells (n = 4) and Eμ-myc
tumour cells (n = 4). e, Spearman’s r and P values for the correlation between 
the expression of MYC and PRMT5, WDR77, SNRPB, SNRPD1 and SNRPD3 in 
publicly available lymphoma data sets. f, Correlation between the expression of 
PRMT5 and MYC in 29 samples from patients with primary leukaemia and 
lymphoma. g, Quantification of PRMT5 staining in normal human lymph 
nodes and lymphoma. A representative image for IHC staining of PRMT5 in 
normal human tonsil and follicular lymphoma. h, Survival of patients with 
lymphoma stratified for PRMT5 and MYC expression, individually and 
combined. 
Extended Data Figure 2 | Reduced disease burden in Eμ-myc;Prmt5+/−
mice. a, Analysis of WBC counts in 8-week-old wild-type (n = 20), Prmt5+/−
(n = 13), Eμ-myc (n = 21) and Eμ-myc;Prmt5+/− (n = 19) littermates. Each
point represents one animal. b, Analysis of tumour burden in 8-week-old
Eμ-myc (n = 15) and Eμ-myc;Prmt5+/− (n = 15) littermates. Each point
represents one animal. c, Representative images showing haematoxylin and
eosin and IHC for PRMT5, B220, Ki67 and Y12 in the spleens of 8-week-old,
age-matched wild-type, Prmt5+/−, Eμ-myc and Eμ-myc;Prmt5+/− littermates.
Extended Data Figure 3 | PRMT5 depletion leads to a reduction in
functioning of the core splicing machinery. a, A representative immunoblot
showing the reduction of PRMT5 protein levels after OHT treatment, and a
corresponding decrease in Y12 levels (methylated Sm proteins). b, Validation
of alternative splicing of Mdm4 mRNA in Eμ-myc B cells after PRMT5
deletion. c, Functional annotation of genes affected by alternative splicing
events (either exon skipping or intron retention). d, Distribution of reads
within exons and introns (ratio of OHT/EtOH). e, MATS output:
quantification of skipped exons (SE), retained introns (RI), mutually exclusive
exons (MXE), A5SS and A3SS (alternative 5′ or 3′ splice site). f, Shapiro score of
the 5′ donor sites of retained intron events identified by MATS. A smooth
density estimate is drawn as calculated by a Gaussian kernel. Top: sequence
logo of the 5′ donor of retained intron events (left) compared with the 5′ donor
of the downstream exon (right), detected in PRMT5-depleted cells. Bottom:
sequence logo of the 5′ donor sites of the skipped exon events (right), compared
with the 5′ donor of the upstream exon (left).
Extended Data Figure 4 | Alternative splicing events in
Eμ-myc;Prmt5F/FCreER bone marrow pre-B cells lead to reduction in
full-length protein levels. a, Validation of additional alternatively spliced
transcripts in Eμ-myc;Prmt5F/FCreER and Prmt5F/FCreER bone marrow B cells
after Prmt5 deletion (OHT) by semi-quantitative PCR. Quantification of three
independent biological replicates is shown on top, while a representative
example is shown in the bottom panel. b, Quantification of selected proteins by
immunoblotting or flow cytometry in Eμ-myc;Prmt5F/FCreER bone marrow B
cells after Prmt5 deletion (OHT).
Extended Data Figure 5 | Alternative splicing events in Eμ-myc;Prmt5F/FCreER
lymphoma cells. a, Disease burden of recipient mice (n = 5 for each
group), as assessed by white blood cell (WBC) counts (left panel), spleen weight
(middle panel) and tumour weight (right panel), 3 weeks after transplantation.
b, Representative images showing the haematoxylin and eosin and IHC for
B220, Ki67, PRMT5 and Y12 in the spleens of recipient mice. c, Validation of
additional alternatively spliced transcripts in Eμ-myc;Prmt5F/FCreER
lymphoma B cells after Prmt5 deletion (OHT) by semi-quantitative PCR.
Representative gel images are shown (n = 5). d, Expression of p53 target genes
after Prmt5 deletion in Eμ-myc;Prmt5F/FCreER lymphoma cells, as a
demonstration of functional/inactive p53 response. Of the 22 lymphomas, each
isolated from independent tumour-bearing Eμ-myc;Prmt5F/FCreER mice, five
(22.72%) showed the upregulation of classical p53 target genes in response to
PRMT5 deletion (indicating a functional p53 pathway), while 17 (77.27%) did
not (indicating an inactive p53 pathway), rates that are similar to previous
reports. e, Validation of alternatively spliced transcripts in
Eμ-myc;Prmt5+/FCreER lymphoma B cells after Prmt5 deletion (OHT) by
semi-quantitative PCR. Representative gel images are shown (n = 3). f, Validation of
additional alternatively spliced transcripts after Myc knockdown in Eμ-myc
lymphoma B cells after Prmt5 deletion (OHT) by semi-quantitative PCR.
Representative gel images are shown (n = 4).
Extended Data Figure 6 | PRMT5 and SmB depletions in human Burkitt
lymphoma lines lead to reduced viability and splicing defects. a, Relative 
viability of Raji and Daudi cells upon PRMT5 depletion with four independent 
shRNAs (shPrmt5-1 to shPRMT5-4) (n = 4). b, Apoptosis profile of Raji and 
Daudi cells after PRMT5 knockdown (shPrmt5-1 to shPRMT5-4) (n = 4). 

c, Cell-cycle profile of Raji and Daudi cells after PRMT5 knockdown (shPrmt5- 
1 and shPRMT5-4) (n = 4). d, Control and PRMT5 depleted (shPrmt5-1 and 
shPRMT5-2) Daudi and Raji cells were xenografted into SCID recipients. The 
Kaplan-Meier analysis of tumour-free survival of the recipient mice is shown; 
n, number of recipient mice analysed. e, PRMT5 and Y12 (methylated Sm 


proteins) levels upon PRMT5 depletion in Raji and Daudi cells. A 
representative blot is shown. f, Validation of additional alternatively spliced 
transcripts (retained introns and skipped exons) after PRMT5 knockdown in 
Daudi and Raji cells by semi-quantitative PCR (n = 4). g, Validation of SMB 
knockdown in Daudi cells (upper left panel) and relative viability after SMB 
knockdown (lower left panel) (n = 3). Validation of additional alternatively 
spliced transcripts (retained introns and skipped exons) after SMB knockdown 
in Daudi cells by semi-quantitative PCR and quantification of gels by Image J. 
The data presented in this figure are the average and s.d. Student’s t-test
(two-sided) was used; *P < 0.05, **P < 0.01.
Extended Data Figure 7 | Myc depletion leads to splicing defects both in
mouse and in human lymphoma cells. a, Relative expression of selected ‘core
snRNPs assembly’ genes following MYC knockdown in Eμ-myc B cells,
assessed by quantitative real-time PCR (n = 3). b, Validation of alternatively
spliced transcripts (retained introns and skipped exons) after MYC knockdown
in Eμ-myc B cells by semi-quantitative PCR (n = 3; representative gel images
are shown). c, MYC, PRMT5, SmD1, SmD3 and β-actin protein expression in
whole cell lysates from Daudi and Raji cells infected with viruses encoding non- 


targeting shRNA (shControl) and two independent sequences targeting MYC 
(shMyc-1, shMyc-2). d, Validation of alternatively spliced transcripts (retained 
introns and skipped exons) after MYC knockdown in Daudi and Raji cells by 
semi-quantitative PCR (n = 3). shControl, scramble control shRNA; shMyc-1, 
shMyc-2, shRNA sequences targeting MYC. The data presented in this figure 
are the average and s.d. Student’s t-test (two-sided) was used; *P < 0.05, 
**P< 0.01. 
Extended Data Figure 8 | PRMT5 is required for normal haematopoiesis.
a, Fetal liver cellularity at E14.5, after tamoxifen injection at E10.5
(Prmt5+/FCreER: n = 3, Prmt5F/FCreER: n = 5). b, Flow cytometry analysis of
Kit+Lin−CD34+ (bottom panel), Kit+Lin−CD34− (top panel), using fetal liver
cells from a. Total number of cells is indicated (Prmt5+/FCreER: n = 3,
Prmt5F/FCreER: n = 5). c, Methocult M3434 colony formation assay using fetal
liver cells from a. CFU-GEMM, colony-forming unit—granulocyte, erythrocyte, 
monocyte/macrophage, megakaryocyte; CFU-GM, colony-forming unit— 
granulocyte, macrophage; CFU-M, colony-forming unit—macrophage; 
CFU-G, colony-forming unit—granulocyte; BFU-E, burst-forming 


unit—erythrocyte. d, GO analysis of differentially expressed genes after 
PRMT5 deletion in fetal liver cells (n = 4). e, Bone marrow cellularity of adult
mice after tamoxifen injection (n = 3 in each group). f, Methocult M3434 
colony formation assay using bone marrow cells from e. g, Whole blood counts 
(WBC) 5 days after PRMT5 deletion in vivo (n = 5). NE, neutrophils; LY, 
lymphocytes; MO, monocytes; EO, eosinophils; BA; basophils; RBC, red blood 
cells. h, Bone marrow cellularity after selective deletion of PRMT5 in the 
haematopoietic system (n = 4 mice in each group). The data presented in this 
figure are the average and s.d. Student’s t-test (two-sided) was used; *P < 0.05, 
**P< 0.01. 
Extended Data Figure 9 | Antisense oligonucleotides targeting ATR, EP400
and DVL1 lead to the reduction of their full-length protein levels. a, Sashimi
plots showing alternatively spliced transcripts of DVL1, ATR and EP400 after
PRMT5 depletion in Eμ-myc;Prmt5F/FCreER cells. b, Quantification of protein
expression after electroporation of Eμ-myc B cells with the respective ASOs
(n = 3). c, Upper panel: schematic representation of ASOs designed to block the
intron retention in Dvl1 induced by PRMT5 knockout. Middle panel:
validation of efficacies of ASOs in reversing the alternative splicing of Dvl1, after
PRMT5 knockout (n = 3). Bottom panel: cell viability of Eμ-myc B cells 2 days
after the electroporation with the respective ASOs. The data presented in this
figure are the average and s.d. Student’s t-test (two-sided) was used; *P < 0.05,
**P < 0.01.
Extended Data Figure 10 | Graphical summary. Top panel: black lines
indicate MYC direct transcriptional upregulation of PRMT5 and other
components of the core snRNP assembly machinery, which ensures splicing
fidelity. Bottom panel: red arrows indicate the perturbation of the
MYC–PRMT5 axis, which leads to a reduction in splicing fidelity within the
cell, skipped exons and retained introns of genes, such as Ep400, Dvl1 and Atr
(which harbour exons with weak 5′-donor sites), downregulation of their
protein levels and, consequently, cell-cycle arrest and apoptosis.
Cytosolic extensions directly regulate a rhomboid
protease by modulating substrate gating


Rosanna P. Baker¹ & Siniša Urban¹


Intramembrane proteases catalyse the signal-generating step of 
various cell signalling pathways, and continue to be implicated 
in diseases ranging from malaria infection to Parkinsonian neuro-
degeneration. Despite playing such decisive roles, it remains
unclear whether or how these membrane-immersed enzymes 
might be regulated directly. To address this limitation, here we 
focus on intramembrane proteases containing domains known to 
exert regulatory functions in other contexts, and characterize a 
rhomboid protease that harbours calcium-binding EF-hands. We 
find calcium potently stimulates proteolysis by endogenous 
rhomboid-4 in Drosophila cells, and, remarkably, when rhomb- 
oid-4 is purified and reconstituted in liposomes. Interestingly, 
deleting the amino-terminal EF-hands activates proteolysis prema- 
turely, while residues in cytoplasmic loops connecting distal trans- 
membrane segments mediate calcium stimulation. Rhomboid 
regulation is not orchestrated by either dimerization or substrate 
interactions. Instead, calcium increases catalytic rate by promoting 
substrate gating. Substrates with cleavage sites outside the mem- 
brane can be cleaved but lose the capacity to be regulated. These 
observations indicate substrate gating is not an essential step in 
catalysis, but instead evolved as a mechanism for regulating pro- 
teolysis inside the membrane. Moreover, these insights provide 
new approaches for studying rhomboid functions by investigating 
upstream inputs that trigger proteolysis. 

Cell membranes are both controlled borders with the outside world
and dynamic platforms for organizing cell signalling, metabolic
pathways, and ultrastructure assembly. All of these key events rely on 
enzymes that reside directly within the cell membrane, yet achieving a 
mechanistic understanding of how these specialized enzymes function 
within this environment has proved challenging. 

Intramembrane proteases catalyse the committed, signal-generating
step of several key signalling pathways by cleaving transmembrane
proteins within the membrane. Their importance is emphasized by
repeated implication in disease. γ-Secretase generates the amyloid-β
peptide in Alzheimer’s disease, but more recently has been successfully
targeted in a spectrum of cancers, because its activating cleavage
of the Notch receptor triggers signalling. Site-2 protease family
metalloenzymes liberate transcription factors from the membrane to control
cholesterol and fatty-acid composition of membranes, and signalling
circuits that control virulence in pathogenic bacteria. Rhomboid serine
proteases are a family of master regulators that initiate epidermal
growth factor signalling during Drosophila development, but more
recently have been implicated in cleaving adhesins during malaria
invasion, and regulating mitochondrial quality control to guard
against Parkinson’s disease.

Since peptide bond cleavage is irreversible in the cell, precise
regulation of protease activity is paramount. Yet it is generally thought that
intramembrane proteases are constitutively active enzymes over which
the cell cannot exert direct regulation. Instead, two mechanisms
control activity. The first is transcriptional, as exemplified by
Drosophila rhomboid-1: the constitutively active protease is made only
when and where needed. This mechanism has historically served as a
beautiful atlas of epidermal growth factor signal initiation during
development. The second mechanism is centred on controlling access
to substrate by segregating it from protease. Malaria, for example,
sequesters adhesins in secretory organelles before invasion, while their
secretion onto the surface leads to the first encounter with an active
rhomboid protease.

The key property missing from these two mechanisms is the ability 
to respond rapidly to changing conditions: transcriptional and cell 
localization changes are ill adapted to provide immediate responses 
that are hallmarks of cell signalling. Moreover, it is essentially unpre- 
cedented for proteases to be devoid of direct enzymatic regulation in 
the cell, raising the possibility that this apparent discrepancy reflects 
our lack of understanding rather than absence of a regulatory mech- 
anism. 

Although Escherichia coli rhomboid protease GlpG has served as a 
tractable model for studying the structure–function of intramembrane
proteolysis, no information is available on its cellular role. This
knowledge gap prohibits deciphering regulatory mechanisms. 
Instead, as a new approach to this question, we searched for rhomboid 
Figure 1 | Calcium rapidly stimulates intramembrane proteolysis in
Drosophila cells by endogenous DmRho4. a, Diagram comparing the
predicted calcium-binding loop residues of DmRho4 to an EF-hand consensus
(in red). b, Calcium ionophore treatment of Drosophila S2R+ cells induced
cleavage of GFP-Spitz, but not its cleavage-site mutant, by endogenous
DmRho4. Graph shows expression levels of Drosophila rhomboid genes in
S2R+ cells (RNA-seq data from modENCODE, http://modencode.org). AU,
arbitrary units. c, Ionophore-induced Spitz cleavage was detectable within
5 min (red triangle) and linear for 3 h. d, RNAi knockdown of DmRho4 but not
of DmRho1 abrogated calcium-induced cleavage of GFP-Spitz. e, Plasmid
expression of DmRho4 rescued calcium-induced cleavage of GFP-Spitz in
S2R+ cells undergoing RNAi. f, Calcium-stimulated Spitz cleavage abolished by
DmRho4 RNAi could not be rescued by DmRho1 overexpression. All images
are anti-GFP western analyses, with substrate and cleavage bands denoted by
black or open triangles, respectively, and non-specific bands marked by ‘x’
(see Fig. 3d for untransfected cells).
Figure 2 | Calcium directly regulates intramembrane proteolytic activity of
DmRho4. a, Proteolysis assay with pure reconstituted DmRho4 with or
without a panel of 1 mM divalent metal ions (upper panels), and four different
rhomboid enzymes reconstituted into proteoliposomes with or without
calcium (lower panel). EcGlpG is from E. coli, PsAarA is from Providencia
stuartii, and VcRho1 is from Vibrio cholerae. b, Analysis of calcium binding to
DmRho4 in proteoliposomes by isothermal titration calorimetry (upper graph
shows the thermograms, lower graph is the liposome-subtracted
quantification). c, Calcium titration analysis of DmRho4 proteolysis using an
inducible real-time reconstitution assay. Black dashed line shows an
alternative fit with an optimal Hill coefficient. RFU, relative fluorescence units.
d, Titration of wild-type (WT) and ΔEF DmRho4 in S2R+ cells comparing
basal, unstimulated Spitz cleavage (left panel), and Spitz cleavage by DmRho4-
ΔEF with or without calcium ionophore (right panel; see also Extended Data
Fig. 2a). Lower graph shows in vitro activity of wild type and EF-hand mutants


proteins that contain additional domains with precedent for regulating 
protein activity and focused on a conserved subset of over two dozen 
animal rhomboid enzymes with EF-hand domains appended to their 
cytosolic amino (N) termini (Fig. 1a and Extended Data Fig. 1). EF-
hands are helix-loop-helix motifs in which calcium binding at the
loop serves either a structural or a regulatory role. In the latter, calcium
binding separates the helices and exposes a new surface for binding a
regulatory partner. EF-hands typically occur in pairs to form a stable
helical bundle, perhaps the best characterized of which are the EF-
hands of calmodulin.

Since rhomboid function is best understood in Drosophila, we 
sought to study the EF-hand containing Drosophila rhomboid-4 
(DmRho4) under physiological conditions by searching for cell lines 
that endogenously express DmRho4. We focused on the well-characterized
S2R+ cell line, which also expresses the housekeeping mitochondrial
rhomboid and low amounts of DmRho1, but no other
rhomboid (Fig. 1b). Treating S2R+ cells with a calcium ionophore
potently stimulated processing of the epidermal growth factor ligand
Spitz by more than 50-fold, but not its transmembrane mutant
(Fig. 1b), and processing was rapid, becoming detectable within
5 min (Fig. 1c). Targeting DmRho4 with RNA interference (RNAi)
removed this stimulation completely, while parallel RNAi against
DmRho1 had no effect whatsoever (Fig. 1d). Importantly, exogenous
expression of DmRho4 lacking the RNAi target sequence fully rescued
calcium stimulation (Fig. 1e). Finally, calcium activated DmRho4
proteolysis directly and not other steps such as Spitz trafficking, because
even high levels of DmRho1 could not substitute for removing
DmRho4 during calcium stimulation (Fig. 1f). Calcium therefore
of DmRho4 in proteoliposomes (error bars, s.d. for experimental replicates).
e, Topology of 3×HA-DmRho4-Flag in S2R+ cells as assessed by
deconvolution immunofluorescence. The N-terminal HA-tag (red) was
inaccessible while the C-terminal Flag tag (green) was accessible in the absence
of detergent, indicating that the N terminus is cytosolic while the C terminus of
DmRho4 is extracellular (blue marks nuclei). f, Ability of DmRho4 cytosolic
loop mutants to cleave GFP-Spitz in response to calcium ionophore
stimulation in S2R+ cells was quantified by anti-GFP western analysis (see also
Extended Data Fig. 2b for DmRho4 levels). Graphs show activity of selectively
compromised loop 4 and 6 mutants under calcium-stimulated conditions in
cells (upper graph) versus unstimulated conditions (lower graph, measured as
cleavage product accumulation in culture media after 24 h; see also Extended
Data Fig. 2c). Error bars, s.d. for experimental replicates. Filled and open
triangles denote substrate and cleavage bands, respectively, throughout.


triggers potent and rapid proteolysis by DmRho4 under physiological 
conditions. 

Our goal was to study direct regulation of rhomboid enzymes. We 
therefore next tested the unlikely possibility that calcium directly reg- 
ulates pure DmRho4 expressed and purified from bacteria and recon- 
stituted into liposomes. Remarkably, addition of calcium directly 
stimulated intramembrane proteolysis of DmRho4 by more than ten- 
fold, but not a panel of other rhomboid proteases, without any other 
protein factors present (Fig. 2a). This stimulation was selective since 
other divalent metal ions failed to have this effect. 

We used isothermal titration calorimetry to characterize the ther- 
modynamic basis of calcium binding to the EF-hand domain (Fig. 2b). 
The resulting thermograms revealed two sites for calcium binding with 
similar dissociation constants, Kd, of 2.1 ± 0.05 μM. Interestingly, a
calcium titration revealed that stimulation of DmRho4 protease activity
(Fig. 2c) could be fitted with a binding isotherm with an apparent
Kd of 112 ± 7.4 μM, implying that calcium binding to at least a third,
lower affinity site (beyond detection by isothermal titration calori-
metry) is required for enzyme activation. To assess this further, we
deleted the EF domain entirely (ΔEF), and found DmRho4 became
dysregulated in cells, with its basal activity elevated approximately
tenfold (Fig. 2d). This effect was direct because pure DmRho4ΔEF
also displayed elevated activity in vitro. However, calcium still stimu-
lated proteolysis of the E93A+E130A mutant that abrogates binding
of calcium to EF-hands, DmRho4ΔEF, and DmRho4ΔN missing its
entire N terminus (Fig. 2d and Extended Data Fig. 2a). These observa- 
tions further indicate that another calcium-binding site outside its 
N-terminal domain is the basis of DmRho4 stimulation, while the 
EF-hand domain functions to limit activation. 
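
The gap between the two affinities can be made concrete with a simple binding isotherm. The following is a minimal sketch that assumes a Hill-type isotherm and uses only the two dissociation constants quoted above (2.1 μM for the EF-hand sites and an apparent 112 μM for activation). The function name `fraction_bound`, the chosen calcium concentrations and the default Hill coefficient of 1 are illustrative assumptions; the original analysis fitted experimental titration data rather than evaluating a formula.

```python
def fraction_bound(ca_um: float, kd_um: float, hill: float = 1.0) -> float:
    """Fractional saturation for a Hill-type binding isotherm.

    With hill = 1 this reduces to the one-site isotherm [Ca] / (Kd + [Ca]).
    Concentrations and Kd are in micromolar.
    """
    if ca_um < 0 or kd_um <= 0 or hill <= 0:
        raise ValueError("need [Ca] >= 0, Kd > 0 and Hill coefficient > 0")
    ratio = (ca_um / kd_um) ** hill
    return ratio / (1.0 + ratio)


# Dissociation constants quoted in the text (micromolar).
EF_HAND_KD_UM = 2.1       # EF-hand sites, from calorimetry
ACTIVATION_KD_UM = 112.0  # apparent Kd for stimulating proteolysis

# Illustrative calcium concentrations (an assumption, not from the paper).
for ca_um in (1.0, 10.0, 100.0, 1000.0):
    ef = fraction_bound(ca_um, EF_HAND_KD_UM)
    act = fraction_bound(ca_um, ACTIVATION_KD_UM)
    print(f"[Ca] = {ca_um:7.1f} uM  EF-hand occupancy = {ef:.2f}  activation = {act:.2f}")
```

Under these assumptions the EF-hand sites are already mostly occupied at calcium concentrations where activation has barely begun, which is the reasoning behind proposing an additional, lower-affinity site.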
Figure 3 | Intermolecular interactions do not mediate DmRho4 regulation
by calcium. a, Anti-Flag co-immunoprecipitation analysis of catalytically
inactive Flag-DmRho4-H358A and GFP-Spitz from S2R+ cells untreated or
treated with calcium ionophore. b, Effect of calcium on the steady-state kinetic
parameters of intramembrane proteolysis by reconstituted DmRho4
(mean ± s.d. of three independent experiments, compared using a paired
t-test). c, Anti-Flag co-immunoprecipitation of Flag-DmRho4 and HA-
DmRho4 co-expressed in S2R+ cells untreated or treated with calcium
ionophore. d, Overexpressing catalytically inactive DmRho4 (H358A) had no
effect on the calcium-stimulated activity of endogenous DmRho4 in S2R+ cells
(compare cleavage bands, denoted by open triangle, in lanes 3 versus 4, and
quantified in graph on right). Expressing low levels of wild-type 3×HA-
DmRho4 (1/1,000 amount of input plasmid) was used to quantify the level of
endogenous DmRho4 (by comparing protease activity) relative to 3×HA-
DmRho4-H358A expression (by comparing anti-HA signals). UN indicates
S2R+ cells not transfected with GFP-Spitz (‘x’ denotes non-specific bands).


To begin mapping the calcium-binding site(s) responsible for 
stimulating proteolysis, we first examined the topology of DmRho4 
in S2R+ cells (Fig. 2e). An antibody accessibility approach revealed the
N terminus of DmRho4 resides in the cytosol while the carboxy (C)
terminus is extracellular, indicating that loops 2, 4, and 6 are cytosolic
and thus candidates for calcium binding. We mutated all 24 residues in
these three cytosolic loops to alanine and assessed their activity in
Drosophila cells (Fig. 2f and Extended Data Fig. 2b, c). The mutants
fell into three classes: most did not affect activity, three perturbed both
stimulated and unstimulated activity and were therefore probably
structural mutants, and seven selectively compromised calcium-stimulated
proteolysis but not DmRho4 structure (Extended Data Fig.
2d), as revealed in a sensitive thermostability assay. The latter localized
to loops 4 and 6, indicating that calcium binding at a site formed
by loops 4 and 6 specifically activates the intramembrane enzyme core, 
but how? 
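
The three-way sorting of the alanine mutants can be written as a small decision rule. The sketch below is illustrative only: the function name `classify_mutant`, the 0.5 cutoff, the mutant labels and the activity values are hypothetical, and the original classification also relied on a thermostability assay that is not modelled here. Activities are expressed relative to wild type (wild type = 1.0).

```python
from collections import Counter


def classify_mutant(stimulated: float, basal: float, cutoff: float = 0.5) -> str:
    """Sort a loop mutant by its calcium-stimulated and unstimulated activity.

    Both activities are relative to wild type. The cutoff is a hypothetical
    threshold below which an activity is treated as compromised.
    """
    if stimulated < 0 or basal < 0:
        raise ValueError("relative activities must be non-negative")
    low_stimulated = stimulated < cutoff
    low_basal = basal < cutoff
    if low_stimulated and low_basal:
        return "structural"          # both activities perturbed
    if low_stimulated:
        return "calcium-selective"   # only the calcium response is lost
    return "unaffected"


# Hypothetical example data: (stimulated, basal) activity relative to wild type.
example_mutants = {
    "mutant_A": (1.0, 0.9),
    "mutant_B": (0.1, 0.2),
    "mutant_C": (0.2, 1.1),
}

classes = {name: classify_mutant(*values) for name, values in example_mutants.items()}
print(classes)
print(Counter(classes.values()))

# Counts stated in the text: 24 mutants, 3 structural, 7 calcium-selective.
N_UNAFFECTED = 24 - 3 - 7
print(f"mutants with no effect on activity: {N_UNAFFECTED}")
```

With the counts given in the text, 14 of the 24 mutants fall into the unaffected class, consistent with the statement that most substitutions did not affect activity.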

We used complementary biochemical and enzymatic approaches to 
decipher the mechanism of calcium stimulation. Although calcium 
binding could expose an exosite for substrate binding, we did not
detect any increased interaction between DmRho4 and substrate in the 
presence of calcium by co-immunoprecipitation analysis either from 
Drosophila cells (Fig. 3a) or from proteoliposomes (Extended Data Fig. 
3a). This was also functionally corroborated by kinetic analysis using 
an inducible real-time reconstitution assay for studying catalysis 
directly within the membrane; calcium stimulated the catalytic rate
of DmRho4 at least sixfold, but did not increase affinity of protease for 
substrate (Fig. 3b). 
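
The kinetic distinction drawn here, a higher catalytic rate versus a tighter substrate affinity, can be illustrated with the Michaelis-Menten rate law. This is a minimal sketch under stated assumptions: the parameter values are hypothetical and unitless, the names `mm_rate` and `fold_stimulation` are introduced only for illustration, and the sixfold factor is taken from the text. It is not the fitting procedure used for Fig. 3b.

```python
def mm_rate(substrate: float, kcat: float, km: float, enzyme: float = 1.0) -> float:
    """Michaelis-Menten initial rate: v = kcat * [E] * [S] / (Km + [S])."""
    if substrate < 0 or kcat < 0 or enzyme < 0 or km <= 0:
        raise ValueError("need [S], kcat, [E] >= 0 and Km > 0")
    return kcat * enzyme * substrate / (km + substrate)


# Hypothetical, unitless parameters chosen only for illustration.
KM = 1.0
KCAT_BASAL = 1.0
KCAT_CALCIUM = 6.0 * KCAT_BASAL  # sixfold higher catalytic rate, Km unchanged


def fold_stimulation(substrate: float, kcat_new: float, km_new: float) -> float:
    """Rate with altered parameters divided by the basal rate at the same [S]."""
    return mm_rate(substrate, kcat_new, km_new) / mm_rate(substrate, KCAT_BASAL, KM)


for s in (0.1, 1.0, 10.0):
    rate_effect = fold_stimulation(s, KCAT_CALCIUM, KM)
    affinity_effect = fold_stimulation(s, KCAT_BASAL, KM / 6.0)
    print(f"[S] = {s:5.1f}  kcat x6: {rate_effect:.2f}-fold  Km /6: {affinity_effect:.2f}-fold")
```

In this toy model a change in the catalytic rate scales the reaction velocity by the same factor at every substrate concentration, whereas an equivalent change in affinity has an effect that fades as substrate becomes saturating. That difference is what allows the two explanations to be separated kinetically.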

Recently rhomboid proteases have been postulated to exist as 
dimers. In contrast to this model, co-expressing two DmRho4 mole-
cules carrying different epitope tags under calcium signalling condi-
tions in Drosophila cells (Fig. 3c) or co-reconstituted into
proteoliposomes (Extended Data Fig. 3b) did not result in dimeriza-
tion. Functionally, we also did not observe calcium stimulation of
protease activity in trans, which is a classical test of allostery resulting 
from oligomerization: a catalytically inactive rhomboid enzyme that 
can still bind calcium could not stimulate the activity of a DmRho4 
enzyme carrying a mutation that compromised calcium binding 
(Extended Data Fig. 3c). 

These observations collectively suggest that DmRho4 is not regu- 
lated by interaction with any other proteins. To test this further in 
Drosophila cells, we overexpressed inactive DmRho4 to outcompete 
the endogenous enzyme for any binding partners (Fig. 3d). 
Remarkably, about 200-fold more inactive DmRho4 had no effect 
whatsoever on the ability of the endogenous DmRho4 to process 
Spitz. This observation, in particular, indicates that rhomboid regu- 
lation under physiological conditions is not mediated by dimerization, 
substrate affinity, or additional factors (although factors that fine tune 
responses in different contexts remain possible). 

In contrast to intermolecular target binding, calcium could directly 
stimulate the activity of a single DmRho4 enzyme through intramo- 
lecular allostery. Calpains are the precedent for this type of activation, 
with calcium binding resulting in a conformation change that aligns 
the catalytic residues. Since the structure of a eukaryotic rhomboid
enzyme has never been solved, we used a biochemical cross-linking 
approach to test whether calcium aligns the catalytic residues of 
DmRho4. Cysteines installed at the catalytic serine and histidine posi- 
tions had no effect on the structural stability of DmRho4 (Extended 
Data Fig. 4a) and could readily and reversibly be oxidized to form a 
disulphide bridge (Fig. 4a). Importantly, calcium did not affect the 
amount of cross-linking, revealing that the DmRho4 catalytic residues 
are pre-aligned with no influence from calcium binding. 

Stimulation of the catalytic rate constant kcat by calcium was strikingly
reminiscent of the increase in kcat we measured for gate-open
mutants of GlpG. This is an attractive parallel, because gate-opening
is the rate-limiting step for rhomboid intramembrane proteolysis, and 
the loops that we predict bind calcium also connect the presumed 
transmembrane gating helix to the rest of the enzyme. One con- 
sequence of gate-opening is that substrates can enter deeper into the 
protease active site, which is reflected in a shift of cleavage site deeper 
into the substrate transmembrane segment. Accordingly, calcium-
stimulated proteolysis shifted the cleavage site 3 residues deeper into
the transmembrane segment for DmRho4 but not other rhomboid 
proteases, consistent with calcium specifically stimulating DmRho4 
proteolysis by facilitating gate-opening (Fig. 4b). 

To explore the functional consequence of this shift, we examined 
proteolysis of a series of transmembrane substrates that we engineered 
to have cleavage sites inside the membrane, outside the membrane, or 
both in the same molecule. Remarkably, the external site was used very 
well in the absence of calcium, while the intramembrane site was barely 
cleaved (Fig. 4c and Extended Data Fig. 4b). Moreover, loop 4 and 6 
mutants that compromised calcium regulation in cells also readily 
cleaved the external site (Extended Data Fig. 4c), which independently 
confirms that the catalytic residues are competent for catalysis in the 
absence of calcium. However, addition of calcium shifted the cleavage 
site from the external site almost exclusively to the intramembrane site 
for DmRho4 while having no effect on the cleavage site selection of 
other rhomboid enzymes (Fig. 4c and Extended Data Fig. 4d). 

Although no information is available on how E. coli GlpG is regu- 
lated, we also extended these analyses to this widely studied enzyme by 
comparing wild type with gate-open mutants. Both wild-type and
gate-open GlpG cleaved the external site with similar efficiency 
(Fig. 4c), while proteolysis at the intramembrane site was specifically 
stimulated by gate-open mutants (Fig. 4d). Moreover, although gate- 
open mutants of E. coli GlpG stimulated proteolysis of transmembrane 
substrates by approximately tenfold (Fig. 4d), a soluble casein substrate 
that approaches the active site from above (not laterally from the 
membrane) was not proteolysed at any higher level (Fig. 4d). 
gating. a, Calcium did not enhance disulphide crosslinking of cysteines
installed at the catalytic serine and histidine positions of DmRho4 (triangles 
indicate crosslinked products). b, Shift in substrate cleavage site generated by 
reconstituted DmRho4 with or without calcium as analysed by mass 
spectrometry following anti-Flag immunoaffinity capture. Masses and triangles 
indicating cleavage sites are colour-matched. c, Processing of a substrate 
carrying intramembrane (blue) and external (orange) cleavage sites in APP- 
Flag that was co-reconstituted into liposomes with DmRho4, and assayed with 
or without calcium. Reactions were analysed by mass spectrometry (top) and 
quantitative western blotting (lower panels). Processing of a substrate carrying 
only the external cleavage site was compared for DmRho4 with or without 
calcium, and for EcGlpG versus its gate-open mutant W236G. d, Quantitative 
cleavage analysis of wild type and gate-open mutants of EcGlpG for the APP- 
Spi7-Flag transmembrane substrate and a soluble BODIPY-casein substrate. 
Full-length substrate (filled triangles) and cleavage products (open triangles) 
are indicated. 


Therefore, cleavage outside the membrane by rhomboid proteases is 
readily possible, but this type of cleavage is difficult to regulate. A 
similar analysis of DmRho4 will require identifying gate-open 
mutants, which should no longer respond to calcium. Lack of a 
DmRho4 structure and <12% sequence identity with GlpG have hin-
dered this approach.

In summary, we discovered that calcium binding directly, rapidly, 
and potently stimulates DmRho4 intramembrane proteolysis by facil- 
itating substrate access to the internal active site (i.e. gating). In addi- 
tion to revealing, to our knowledge, the first mechanism for directly 
regulating any intramembrane protease, these observations resolve the 
long-standing mystery of why these proteases gate substrate entry. 
Gating has been controversial, because rhomboid proteases can be 
made to process substrates without lateral gate-opening, and cleav-
age can be moved to external sites in substrates. In doing so, it was
questioned why intramembrane gate-opening occurs, if at all. Our
observations now reveal that intramembrane substrate gating is a
means of enzyme regulation, not a necessary step in catalysis: cleavage
outside the membrane is possible, but cannot be tightly regulated 
directly. 

The ability of cytosolic regions to regulate substrate gating at a 
distant, intramembrane site also suggests deeper organizational fea- 
tures within rhomboid architecture that are only beginning to be 
studied. Interestingly, the ‘non-canonical’ mode of calcium binding
to DmRho4 loops is reminiscent of synaptotagmin activation, which
also involves loops that are placed into close contact with lipid mole-
cules. In fact, rhomboid activation may also involve lipids, which
might explain the apparent high calcium levels needed for full enzyme
activation in vitro: synaptotagmin exhibits an intrinsic Kd of 530 μM
for calcium at the C2 site that decreases to 3–4 μM when appropriate
lipids are present. Although ultimately structural analysis is required
to reveal the precise architecture of calcium binding, likely involve- 
ment of lipid, and impact on gating in DmRho4, so far no rhomboid 
enzyme with an intact extramembranous domain has produced well- 
diffracting crystals. Our studies provide incentive to move beyond 
GlpG and focus structural biology efforts on these more complex 
rhomboid proteases. 

A particularly exciting implication of these enzymatic properties is 
that rhomboid proteases can directly integrate upstream signals from 
other signalling pathways. In fact, this may have medical implications, 
since Ventrhoid/RHBDL3, a human rhomboid that contains potential 
calcium-binding residues in its EF-hands and cytoplasmic loops, is 
expressed in the nervous system and may be linked to a mental
retardation syndrome. In this light, studying upstream regulation
provides a powerful new approach towards revealing biological func- 
tions of rhomboid proteases that have evaded discovery. In fact, pre- 
vious efforts could have missed important roles because they were 
studying rhomboid functions under unstimulated conditions. 
Activation is not limited to calcium signalling, since a diversity of 
recognizable domains have been appended to different rhomboid pro-
teins including zinc fingers, β-propellers, and tetratricopeptide repeats.
It should be noted that not all extramembranous domains necessarily 
serve direct regulatory functions. For example, trafficking signals have 
been found in the cytosolic domains of parasitic rhomboid enzymes.

Finally, while we focused our studies on rhomboid proteases, substrate
gating has been proposed for other intramembrane proteases,
raising the possibility that gating may be a general
mechanism for directly regulating intramembrane proteolysis. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

Rhomboid expression constructs. The Drosophila rhomboid-4 open reading 
frame (ORF) was subcloned from SD06923 (Drosophila Genomics Research 
Center) into pGEX-6P-1 or pET21 for bacterial expression, or into pRmHa3 for 
expression in Drosophila S2R+ cells. An N-terminal 3×HA-tag was introduced to
allow detection of DmRho4 by anti-HA western analysis. Residue substitutions 
were introduced by site-directed mutagenesis using Pfu Ultra in a Stratagene 96 
Gradient Robocycler (Agilent Technologies) and were confirmed by sequencing 
the entire ORF. The EF-hand domain of DmRho4 (residues 68-154) or its entire 
N-terminal domain (residues 1-176) were deleted by site-directed mutagenesis to 
generate the ΔEF and ΔN mutants, respectively.

Drosophila cell culture and manipulation. Drosophila S2R+ cells (purchased
from the Drosophila Genomics Research Center) were cultured at 25 °C in
Schneider’s insect medium (Invitrogen) supplemented with 10% fetal bovine
serum (FBS, Sigma). For transfection experiments, S2R+ cells were seeded into
six-well plates and transfected with a total of ~2 μg of plasmid DNA (0.5 μg each
of pRmHa3-GFP-Spitz and pRmHa3-Star plus 1 μg pBluescript, and 5–25 ng of
pRmHa3-3×HA-DmRho4 if applicable) and 5 μl of X-tremeGENE HP (Roche).
Transfection complexes were formed at room temperature in 100 μl of DMEM for
15 min before being added dropwise to cells. Expression from the metallothionein
promoter was induced the following day with 0.5 mM CuSO4 for ~24 h. For
ionophore experiments, cells were then washed with insect saline (10 mM
HEPES pH 7.3, 120 mM NaCl, 5 mM KCl, 32 mM sucrose, 8 mM MgCl2, 2 mM
CaCl2), and incubated with 6 μM ionomycin (Sigma) in insect saline for 2–3 h,
unless otherwise indicated. Cells were lysed in reducing Laemmli buffer, the lysates
resolved on 4–20% tris-glycine SDS polyacrylamide gels, and electrotransferred to
nitrocellulose. GFP-Spitz (or its transmembrane mutant SGA→LLL) and 3×HA-
DmRho4 were detected by anti-GFP and anti-HA western analysis, respectively,
and fluorescence quantified with an Odyssey infrared laser scanner (LiCor
Biosciences). The protein level of endogenous DmRho4 in S2R+ cells was
estimated by titrating wild-type 3×HA-DmRho4 by transfection: matching the level
of transfected 3×HA-DmRho4 to the endogenous DmRho4 expression level
resulted in a doubling of GFP-Spitz processing (Fig. 3d). To detect basal
(unstimulated) DmRho4 activity, Schneider’s serum-free media was conditioned for 24 h,
and the GFP-Spitz that had been released by proteolysis was quantified by anti-
GFP western analysis of media fractions.
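
As a worked example of preparing the insect saline listed above (10 mM HEPES, 120 mM NaCl, 5 mM KCl, 32 mM sucrose, 8 mM MgCl2, 2 mM CaCl2), the sketch below converts the stated millimolar concentrations into grams per volume. It is an illustration, not part of the original protocol: the molar masses are standard values for the anhydrous compounds, which is an assumption because the salt forms used are not stated (hydrates such as MgCl2·6H2O would need their own molar masses), and pH adjustment to 7.3 is not modelled. The names `INSECT_SALINE_MM` and `grams_needed` are hypothetical.

```python
# Concentrations from the Methods text, in millimolar.
INSECT_SALINE_MM = {
    "HEPES": 10.0,
    "NaCl": 120.0,
    "KCl": 5.0,
    "sucrose": 32.0,
    "MgCl2": 8.0,
    "CaCl2": 2.0,
}

# Standard molar masses in g/mol, assuming anhydrous compounds (an assumption).
MOLAR_MASS_G_PER_MOL = {
    "HEPES": 238.30,
    "NaCl": 58.44,
    "KCl": 74.55,
    "sucrose": 342.30,
    "MgCl2": 95.21,
    "CaCl2": 110.98,
}


def grams_needed(volume_l: float = 1.0) -> dict:
    """Return the mass in grams of each component for the given volume in litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {
        name: conc_mm / 1000.0 * MOLAR_MASS_G_PER_MOL[name] * volume_l
        for name, conc_mm in INSECT_SALINE_MM.items()
    }


for name, grams in grams_needed(1.0).items():
    print(f"{name:8s} {grams:8.4f} g per litre")
```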

Drosophila cell RNAi. Templates for in vitro transcription were amplified by 
PCR with primers corresponding to the divergent N-terminal regions of DmRho1
(ORF nucleotides 1–451) and DmRho4 (ORF nucleotides 1–501) and incorpo-
rated T7 promoter sequences. RNA was generated using the RiboMAX T7 kit
(Promega) according to the manufacturer’s instructions, and purified using the 
RNeasy protocol (Qiagen). Double-stranded RNA (dsRNA) was formed by mix- 
ing equal amounts of each strand in annealing buffer (1mM HEPES pH 7.3, 
0.5mM EDTA) and boiling for 5 min, followed by slow cooling. dsRNA was 
analysed by agarose gel electrophoresis. S2R+ cells seeded in six-well plates were
washed with serum-free Schneider’s media, and ~25 μg of dsRNA was added to
each well containing 1 ml of serum-free Schneider’s insect media. After a 1–2 h
incubation, 3 ml of Schneider’s media + 10% FBS was added, and the cells were 
incubated at 25 °C for 3 days. On the third day, cells were assayed by transfection 
as described above. 

Drosophila cell microscopy. S2R+ cells were seeded onto glass coverslips,
transfected with 3×HA-DmRho4-1×Flag (which was verified to be proteolytically
active and calcium regulated), induced with 0.5 mM CuSO4 for ~24 h, and fixed
in 4% formaldehyde in PBS for 20 min. Cells were blocked with 1% bovine serum
albumin in the presence or absence of 0.1% Triton X-100, and stained with 1/200
anti-HA and anti-Flag antibodies overnight. The resulting immune complexes
were detected with 1/500 secondary antibodies conjugated to Alexa-488 or
Alexa-594 (Molecular Probes), mounted in the presence of DAPI, and imaged
on a DeltaVision Elite deconvolution microscope (GE Healthcare) using 0.2 μm
optical sections with a 1.4 numerical aperture ×100 objective lens.
Recombinant protein production. Wild-type or engineered variants of DmRho4 were expressed as N-terminal GST- or His-tag fusions in E. coli BL21(DE3) cells. Cultures were grown in M9* minimal media containing 100 µg ml−1 ampicillin at 37 °C, shaking at 250 r.p.m. When cultures reached an absorbance at 600 nm of 0.5, protein expression was induced with the addition of 50 µM IPTG for 16-18 h at 16 °C. Bacterial cell lysates were prepared using a French pressure apparatus operating at 10,000 p.s.i. and cell membranes were pelleted by ultracentrifugation at 350,000g for 30 min. DmRho4 was solubilized from membranes in 2% dodecyl-β-D-maltoside (DDM) for 1 h at 4 °C, followed by ultracentrifugation at 350,000g for 30 min to remove insoluble material. GST-DmRho4 was affinity-purified on glutathione-sepharose and eluted by PreScission protease (GE Healthcare) cleavage of the GST-tag at 4 °C overnight, while His-Rho4 was affinity-purified with Ni-NTA (Qiagen) or His-tag resin (Roche) and eluted using imidazole as recommended by the manufacturers. The N-terminal His-tag was removed by
thrombin cleavage at 4 °C overnight. Bacterial rhomboid proteases were expressed 
as N-terminal GST fusion proteins and purified by glutathione-sepharose affinity 
as described above. The APP-Spi7-Flag substrate was produced as described previously. Briefly, expression of APP-Spi7-Flag was induced for 2-3 h at 37 °C in BL21(DE3) cells with 1 mM IPTG. Cells were lysed using a French pressure apparatus and lysates were cleared of cellular debris by centrifugation at 3,000g for 10 min. The substrate was affinity purified using anti-Flag M2 agarose (Sigma). A variant of APP-Spi7-Flag with an additional rhomboid cleavage site outside the transmembrane domain was constructed by inverse PCR to substitute residues 20-25 (FFAEDV) of APP-Spi7-Flag with the sequence IATAAF from P. stuartii TatA. APP-TatA-Flag had only the external TatA site.

Intramembrane proteolysis assays. Rhomboid proteases were co-reconstituted with substrates into liposomes using the inducible reconstitution system that we described recently. Bacterial rhomboid proteases were reconstituted in liposomes formed from an E. coli polar lipid extract, while DmRho4 was reconstituted in liposomes formed from a yeast polar lipid extract or 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine lipids (Avanti Polar Lipids). All enzymes were assayed for 1 h at 37 °C, except DmRho4, which was assayed at 25 °C for 2-4 h. APP-Spi7-Flag reaction products were resolved by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) and quantified by anti-Flag western analysis using an Odyssey infrared laser scanner (LiCor Biosciences), while products for the real-time assay using fluorescein isothiocyanate (FITC)-TatA were quantified using a Synergy H4 Hybrid plate reader (Biotek) scanning once per minute. Proteolysis assays were supplemented with 0.5 mM calcium unless otherwise indicated. For calcium titration experiments, 10 pmol of DmRho4 and 200 pmol FITC-TatA were co-reconstituted into 30 µg yeast liposomes and total calcium was titrated from 0 to 1 mM. For kinetic analysis, 10 pmol of Rho4 was titrated against 15-600 pmol FITC-TatA substrate in the presence or absence of 0.5 mM calcium; initial reaction rates were extracted and fitted to a Michaelis-Menten model using Prism software.
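
The kinetic analysis above can be illustrated with a minimal sketch. It is not the Prism nonlinear regression used in the study: as a simpler stand-in, it estimates Vmax and Km by least squares on the Hanes-Woolf linearization, s/v = s/Vmax + Km/Vmax. The function names, the substrate amounts and the parameter values are all hypothetical, and the rates are synthetic and noise-free.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate amount s under the Michaelis-Menten model."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (vmax, km) from the straight line s/v = s/vmax + km/vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / vmax
    intercept = mean_y - slope * mean_x    # equals km / vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Hypothetical substrate amounts (pmol) within the 15-600 pmol range, and
# synthetic rates generated from assumed parameters (not measured data).
substrate_pmol = [15, 30, 60, 120, 240, 480, 600]
true_vmax, true_km = 2.0, 100.0
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate_pmol]

vmax_est, km_est = fit_hanes_woolf(substrate_pmol, rates)
```

With real, noisy rates a direct nonlinear fit is generally preferable, because the linearization distorts the error structure of the data.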

Casein cleavage assay. Wild-type GlpG and engineered variants were assayed in reactions consisting of 6, 12, or 24 pmol enzyme and 50 µg ml−1 BODIPY FL casein (Life Technologies) in 50 mM Tris pH 7.4, 150 mM NaCl and 0.1% DDM for 1 h at 37 °C. Reactions were quenched by adding an equal volume of 2× Tricine sample buffer, resolved by SDS-PAGE on 16% Tricine gels (Life Technologies) and imaged using a Typhoon Imager (GE Healthcare) with settings of 488 nm for excitation and 526 nm for emission.

Isothermal titration calorimetry. Isothermal titration calorimetry used a MicroCal iTC200 instrument (GE Healthcare). The reaction cell contained 20-30 µM DmRho4 in proteoliposomes and the reference cell contained water. A motorized syringe loaded with 1 mM calcium was used to perform successive 2 µl injections into the reaction cell at 25 °C. Control experiments titrating calcium against liposomes were performed to determine the heat of titrant dilution, which, subtracted from the heat of reaction, yielded the effective heat of calcium binding. Data were fitted using Origin analysis software.
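
The dilution correction described above amounts to a per-injection subtraction, sketched below. The function name and the heat values are hypothetical and purely illustrative; the binding-model fit itself (done in Origin in the study) is not reproduced here.

```python
def binding_heats(reaction_heats, dilution_heats):
    """Subtract the heat of titrant dilution from the heat of reaction.

    reaction_heats: per-injection heats for calcium into DmRho4 proteoliposomes.
    dilution_heats: per-injection heats for calcium into liposomes alone.
    """
    if len(reaction_heats) != len(dilution_heats):
        raise ValueError("both titrations must have the same number of injections")
    return [r - d for r, d in zip(reaction_heats, dilution_heats)]


# Hypothetical per-injection heats in arbitrary units (not measured data).
reaction = [-8.0, -6.5, -4.0]
dilution = [-0.5, -0.5, -0.5]
corrected = binding_heats(reaction, dilution)
```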

Thermostability analysis. Wild-type and engineered variants of DmRho4 were subjected to quantitative thermostability analysis as described previously. Briefly, pure DmRho4 was diluted to 5 µM in 50 mM Tris pH 7.4, 150 mM NaCl, 0.1% dodecyl-β-D-maltoside, 0.5% or 1.2% nonyl-glucoside, heated from 25 °C to 85 °C at a rate of 0.2 °C per minute, and the differential static light scattering was quantified every 0.5 °C in a Stargazer-384 instrument (Harbinger Biotech). Light scattering data were fitted to a two-state Boltzmann curve using Stargazer BioActive software to derive transition temperature midpoints (Tm).
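
To make the two-state Boltzmann model concrete, the sketch below evaluates one common form of the curve on the 25-85 °C grid sampled every 0.5 °C and recovers the midpoint. It does not reproduce the Stargazer software fit: the midpoint is located by simple linear interpolation at the half-maximal signal, and the parameterization, function names and parameter values are assumptions chosen for illustration.

```python
import math


def boltzmann(temp, bottom, top, tm, slope):
    """Two-state Boltzmann sigmoid: signal rises from bottom to top around tm."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - temp) / slope))


def midpoint_by_interpolation(temps, signal):
    """Temperature at which the signal first crosses halfway between its min and max."""
    half = (min(signal) + max(signal)) / 2.0
    for i in range(1, len(temps)):
        lo, hi = signal[i - 1], signal[i]
        if lo < half <= hi:
            frac = (half - lo) / (hi - lo)
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("signal never crosses its half-maximal value")


# Temperature grid matching the protocol: 25 to 85 degrees C in 0.5 degree steps.
temps = [25.0 + 0.5 * i for i in range(121)]
# Synthetic light-scattering trace from assumed parameters (not measured data).
signal = [boltzmann(t, 10.0, 110.0, 52.3, 1.5) for t in temps]
tm_est = midpoint_by_interpolation(temps, signal)
```

Interpolation is only adequate when both plateaus are well sampled; a least-squares fit of all four parameters is the more robust choice for real traces.
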
Rhomboid-substrate co-immunoprecipitation. S2R+ cells were co-transfected with catalytically inactive pRmHa3-Flag-DmRho4-H358A and pRmHa3-GFP-Spitz or, as a negative control, with pRmHa3-GFP-Spitz alone, and were either untreated or treated with 6 µM ionomycin in insect saline or Ca-free insect saline, as described above. Cell lysates were solubilized for 1 h at room temperature in 25 mM Tris pH 7.4, 150 mM MgCl2, Complete EDTA-free protease inhibitor cocktail (Roche), and 0.25% DDM in the presence of either 1 mM CaCl2 or 1 mM MgCl2. Cell debris was removed by centrifugation at 16,000g for 20 min at 4 °C. Immunoprecipitations were done using anti-Flag M2 agarose for 1 h at room temperature. Beads washed in the presence of 1 mM CaCl2 or 1 mM MgCl2 were resuspended in SDS sample buffer, then load and bound fractions were resolved by SDS-PAGE followed by anti-Flag/anti-GFP western analysis. For co-immunoprecipitation analysis in proteoliposomes, catalytically inactive HA-tagged DmRho4 (S299A) was co-reconstituted with APP-Spi7-Flag as described above in buffer consisting of 50 mM Tris pH 7.5, 150 mM NaCl, and either no calcium or 0.5 mM CaCl2. Proteoliposomes were incubated for 1 h at room temperature, solubilized with 1% DDM for 30 min at room temperature, and then immunoprecipitation was performed with anti-Flag M2 agarose for 1 h at room temperature in the presence or absence of calcium. Washed beads were resuspended in SDS sample buffer and the input and bound fractions were compared by anti-HA and anti-Flag western analysis.

Rhomboid co-immunoprecipitation. S2R+ cells were co-transfected with expression constructs encoding triple HA-tagged and Flag-tagged DmRho4, and as a control, with HA-tagged DmRho4 alone as described above, and were then either untreated or treated with 6 µM ionomycin in insect saline or Ca-free insect saline. Cells were solubilized with 0.25% DDM (as described above) and immunoprecipitations were performed with anti-Flag M2 agarose for 1 h at room temperature. Beads washed in the presence of 1 mM CaCl2 or 1 mM MgCl2 were resuspended in SDS sample buffer. Load and bound fractions were detected by anti-HA and anti-Flag western analysis. Pure samples of HA-tagged and Flag-tagged DmRho4 were also co-reconstituted into proteoliposomes in the absence or presence of 0.5 mM CaCl2, then solubilized with 1% DDM for 30 min at room temperature, immunoprecipitated with anti-Flag M2 agarose (as described above) and then subjected to anti-HA/anti-Flag western analysis.

Disulphide cross-linking. A cysteine-less mutant of DmRho4 was generated by substituting the three amino-terminal native cysteine residues with serine residues (C104S, C150S, C176S) and the native transmembrane cysteine residue with a valine residue (C334V). Using this construct as a template, the active site residues S299 and H358 were replaced with cysteine residues, either singly, to generate S299C and H358C, or in combination, to generate the double mutant S299C/H358C. Pure cysteine-substituted proteins were reduced by treatment with 5 mM TCEP for 30 min at room temperature and then passed through a Zeba spin column (Pierce) equilibrated in 50 mM Tris, 150 mM NaCl, 0.1% DDM. Proteins were then oxidized, either in detergent micelles or after reconstitution into proteoliposomes, by the addition of 50 or 100 µM copper phenanthroline, respectively, for 15 min at room temperature in the absence or presence of 0.5 mM calcium. Note that owing to the random orientation of rhomboid in reconstituted proteoliposomes, only half of the DmRho4 was accessible to the oxidizing reagent. Control reactions were done in parallel with no copper but with 50 mM DTT to prevent spontaneous oxidation. Reactions were stopped by the addition of SDS sample buffer. As an additional control, the oxidation observed for the double mutants was reversed by the addition of 50 mM DTT for 10 min to the oxidized proteins in SDS sample buffer. Samples were analysed by SDS-PAGE and stained using either IRDye Blue Protein Stain (LiCor) or Krypton Infrared Protein Stain (Pierce).

Mass spectrometry. Full-length substrate and C-terminal cleavage products were purified from in vitro proteolysis assays by anti-Flag immunoaffinity isolation and analysed by matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry using sinapinic acid matrix as described previously.

No statistical methods were used to predetermine sample size.


31. Urban, S. & Wolfe, M. S. Reconstitution of intramembrane proteolysis in vitro
reveals that pure rhomboid is sufficient for catalysis and specificity. Proc. Natl
Acad. Sci. USA 102, 1883-1888 (2005).

32. Baker, R. P., Young, K., Feng, L., Shi, Y. & Urban, S. Enzymatic analysis of a 
rhomboid intramembrane protease implicates transmembrane helix 5 as the 
lateral substrate gate. Proc. Natl Acad. Sci. USA 104, 8257-8262 (2007). 
[Extended Data Figure 1a: multiple sequence alignment of the EF-Hand 1 and EF-Hand 2 regions; consensus motif for each calcium-binding loop: DxDxDGxIxxxE]

H. sapiens (human) NP_612201
M. mulatta (rhesus monkey) NP_001181500
E. caballus (horse) XP_05598010
M. musculus (mouse) NP_631974
R. norvegicus (rat) NP_001099289
P. troglodytes (chimpanzee) XP_009430706
T. castaneum (red flour beetle) XP_972541
N. vitripennis (jewel wasp) XP_008206426
A. mellifera (honeybee) XP_395780
C. quinquefasciatus (southern house mosquito) XP_001844179
D. melanogaster (fruit fly) NP_525084
A. aegypti (yellow fever mosquito) XP_001653937
A. gambiae (malaria mosquito) XP_31808
P. humanus corporis (body lice) EEB13444
B. malayi (roundworm nematode) XP_001896728
A. pisum (pea aphid) XP_001947411
C. intestinalis (sea squirt) XP_002130870
M. domestica (grey short-tailed opossum) XP_007485620
T. nigroviridis (pufferfish) CAF91194
D. rerio (zebrafish) NP_001017556
G. gallus (chicken) XP_415663
T. guttata (zebra finch) XP_002194875
B. taurus (bull) XP_615731
C. familiaris (dog) XP_00562882

Extended Data Figure 1 | The rhomboid-4 subfamily of rhomboid proteases. a, ClustalW multiple sequence alignment of the conserved N-terminal EF-hand domains of 24 members of the rhomboid-4 subfamily (generated in Biology Workbench, http://workbench.sdsc.edu). Identical residues are shaded in green, highly conserved residues in yellow, and similar residues in cyan. The EF-hand calcium-binding loop consensus sequence is given below the alignment. b, Rooted tree of the 24 rhomboid-4 homologues. Genus and species names are colour-coded as follows: primates (blue), other mammals (green), birds (purple), fish (pink), non-vertebrate chordate (cyan), nematode (orange), and insects (red), with vernacular names given in parentheses, followed by National Center for Biotechnology Information (NCBI) accession numbers.
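
As an illustration of how the primary loop consensus DxDxDGxIxxxE can be applied to a sequence, the sketch below scans for it with a regular expression in which x is any residue. The function name and the test sequence are hypothetical. The strict pattern ignores the alternative residues that the figure lists at some positions, so variant loops are deliberately not matched here.

```python
import re

# Primary EF-hand loop consensus from the figure, with x written as '.'.
EF_LOOP = re.compile(r"D.D.DG.I...E")


def find_ef_loops(sequence):
    """Return (start_index, loop) for every non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in EF_LOOP.finditer(sequence)]


# Hypothetical toy sequence: one loop that fits the strict consensus and one
# variant loop (N instead of D at position 5) that does not.
toy_sequence = "MKAADKDGDGTITTKELLGSDADGNGTIDFPEFK"
loops = find_ef_loops(toy_sequence)
```
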
Extended Data Figure 2 | Activity and thermostability analysis of DmRho4 mutants. a, Comparison of calcium stimulation of DmRho4 versus its EF-hand domain deletion mutant (ΔEF), and a mutant lacking the entire cytosolic domain (ΔN). Upper diagram shows position of domains (demarked by residue numbers) and the corresponding deletion constructs. Transmembrane segments are shown as grey rectangles. GFP-Spitz substrate and cleavage products (green bands in the anti-GFP western) are denoted by black and white triangles, respectively. DmRho4 protein levels are shown as red bands (anti-HA western). b, Analysis of DmRho4 loop 2, 4, and 6 mutant protein levels from Fig. 2f (calcium stimulation conditions). c, DmRho4 loop 2, 4, and 6 mutants were assayed for cleavage of GFP-Spitz under basal (unstimulated) conditions for ~24 h in the absence of calcium. Cleavage product (green bands, white arrowhead) was detected in media fractions for most of the mutants at levels comparable to the wild-type enzyme. Corresponding DmRho4 protein levels are shown as red bands (anti-HA western analyses). d, Wild-type DmRho4 and engineered variants were expressed and purified from bacteria, subjected to quantitative thermal stability analysis, and transition temperature midpoints (Tm) were derived (error bars, s.d. of four experimental replicates). The thermal stability of mutant DmRho4 proteases was indistinguishable from that of wild-type DmRho4.
Extended Data Figure 3 | Calcium does not regulate DmRho4 through intermolecular interactions. a, Anti-Flag co-immunoprecipitation analysis of HA-DmRho4 and APP-Spi7-Flag substrate from proteoliposomes in the presence or absence of 0.5 mM calcium. An inactive mutant of DmRho4 (S299A) was used to facilitate substrate complex isolation. The amount of HA-tagged DmRho4 co-immunoprecipitated with the Flag-tagged substrate was not affected by the presence of 0.5 mM calcium. L, load; B, bound. b, Anti-Flag co-immunoprecipitation of Flag-DmRho4 and HA-DmRho4 from proteoliposomes. HA-tagged DmRho4 failed to co-immunoprecipitate with Flag-tagged DmRho4 in both the absence and presence of 0.5 mM calcium. c, Mixing a catalytic mutant (H358A) and a calcium-binding mutant (E382A) cannot rescue calcium stimulation in trans (star indicates lane where a product would be expected with the mixed single mutants).
Extended Data Figure 4 | Lateral substrate gating underlies direct regulation of intramembrane proteolysis. a, Thermostability analysis of single and double cysteine mutants of DmRho4 (error bars, s.d. of four experimental replicates). b, Average relative proportions of cleavage at the external cleavage site (orange) compared with the internal cleavage site (blue) are shown for DmRho4 in the absence (no Ca) and presence (+ Ca) of 1 mM calcium (error bars, standard error of replicate experiments). The external site was favoured in the absence of calcium (approximately 80%) while internal cleavage was preferred in the presence of calcium (approximately 70%). c, DmRho4 loop 4 and loop 6 calcium-binding site mutants retained calcium-independent cleavage of a substrate harbouring only an external cleavage site. Full-length substrate (filled triangle) and cleavage product (open triangle) are indicated. d, Cleavage of a substrate with external and internal cleavage sites was compared for E. coli GlpG, P. stuartii AarA, and V. cholerae Rho1 in the absence (no Ca) or presence (+ Ca) of 0.5 mM calcium. The relative proportions of cleavage at the two sites varied between the bacterial rhomboid proteases, but in no case did calcium alter the cleavage site preference.
Structures of actin-like ParM filaments show
architecture of plasmid-segregating spindles 


Tanmay A. M. Bharat¹, Garib N. Murshudov¹, Carsten Sachse² & Jan Löwe¹


Active segregation of Escherichia coli low-copy-number plasmid R1 involves formation of a bipolar spindle made of left-handed double-helical actin-like ParM filaments. ParR links the filaments with centromeric parC plasmid DNA, while facilitating the addition of subunits to ParM filaments. Growing ParMRC spindles push sister plasmids to the cell poles. Here, using modern electron cryomicroscopy methods, we investigate the structures and arrangements of ParM filaments in vitro and in cells, revealing at near-atomic resolution how subunits and filaments come together to produce the simplest known mitotic machinery. To understand the mechanism of dynamic instability, we determine structures of ParM filaments in different nucleotide states. The structure of filaments bound to the ATP analogue AMPPNP is determined at 4.3 Å resolution and refined. The ParM filament structure shows strong longitudinal interfaces and weaker lateral interactions. Also using electron cryomicroscopy, we reconstruct ParM doublets forming antiparallel spindles. Finally, with whole-cell electron cryotomography, we show that doublets are abundant in bacterial cells containing low-copy-number plasmids with the ParMRC locus, leading to an asynchronous model of R1 plasmid segregation.

Using electron cryomicroscopic (cryo-EM) images collected on a direct-electron detector, we performed real-space helical reconstruction to elucidate a 4.3 Å structure of ParM filaments assembled with the nucleotide AMPPNP (Fig. 1a-c and Extended Data Fig. 1, Extended Data Table 1 and Supplementary Video 1). Densities corresponding to α-helices, β-strands and many side chains were clearly observed (Fig. 1d-g). AMPPNP was also observed in our map as strong density, especially on the phosphates (Fig. 1h). No apparent resolution anisotropy was detected in the reconstruction (Extended Data Fig. 1), indicating that the entire ParM protein is rigidly held in the filament. To derive an atomic model of the ParM filament, a previous, monomeric crystal structure of ParM and AMPPNP bound to the tail of ParR (Protein Data Bank (PDB) accession code 4A62) was fitted into the map, and the filament model was iteratively rebuilt and all-atom refined using stereochemical restraints with REFMAC.

Surprisingly, the two protofilaments (strands) making up the double-helical ParM filament are held together only by salt bridges (Fig. 2a, b, Extended Data Figs 2 and 3 and Extended Data Table 2). The ParM inter-protofilament interface is small (calculated interface area 371 Å²) and does not resemble a canonical protein-protein interface containing a hydrophobic core. To demonstrate the validity of this assessment we mutated two positively charged residues within the inter-protofilament interface to aspartic acids (K258D, R262D) and tested what effect this had on the stability of ParM filaments. Filament formation (with AMPPNP) from the resulting mutant protein ParM(K258D, R262D) was inefficient (Extended Data Fig. 3g). The few filaments that were formed were unstable, and tended to be bent (Fig. 2c and Extended Data Fig. 3h). Reference-free class averaging of these filaments showed that even though most of the few observed filaments were double helical like wild-type ParM, some single-helical filaments were also present (Fig. 2d and Extended Data Fig. 3i). These observations indicate that although the interface between protofilaments in ParM is surprisingly small, it is sufficient for double-helical filament assembly since many identical contacts along the filament contribute to the overall binding energy. Different actin-like proteins show very different filament arrangements, from single (crenactin, possibly) to parallel double helical (left-handed: ParM; right-handed: actin; and non-staggered: MamK) and antiparallel, double straight


Figure 1 | Cryo-EM reconstruction at 4.3 Å of ParM+AMPPNP filaments. a, Cryo-EM image of ParM+AMPPNP filaments. Inset: class average. This experiment was repeated nine times. b, A 4.3 Å reconstruction of the filaments, isosurface contoured at 2σ away from the mean (see Extended Data Fig. 1 and Supplementary Video 1). c, The same reconstruction as b, overlaid with the refined atomic model with individual ParM subunits coloured differently. d-g, Enlarged regions of the cryo-EM map showing resolved secondary structure elements and side-chain densities, contoured at 1σ. h, Density for the nucleotide is stronger than that of the protein (contoured at 3σ).
Figure 2 | ParM filaments are made up of two protofilaments held together
by salt bridges, which are perturbed when ParM is bound to ADP. a, The 
refined atomic model of ParM+AMPPNP filaments shows that the 
protofilaments are held together laterally by salt bridges. Basic residues at the 
interface are highlighted in red and acidic residues in orange (see Extended 
Data Table 2). Within the protofilaments’ longitudinal interfaces, more 
extensive hydrophobic interactions are observed (see Extended Data Fig. 2). 
b, A magnified view of a. The charges of two basic residues at the interface were 
inverted by mutation for c (K258D, R262D). c, The resulting protein formed 
filaments inefficiently. Cryo-EM image showing filaments of ParM(K258D, 
R262D) assembled with AMPPNP. This experiment was repeated four times. 
d, In addition to normal double-helical filaments, some single-helical filaments 


(MreB). We propose that small and simple inter-protofilament contacts could have made it possible to change inter-protofilament arrangements relatively easily during evolution since all these actin-like filaments show similar longitudinal contacts.

The protofilaments of ParM themselves are held together by an extensive longitudinal contact area (~995 Å²), containing both hydrophilic and hydrophobic interactions (Extended Data Fig. 2 and Extended Data Table 2). Actin filaments have also been shown to have the same difference in interface size between the longitudinal and lateral contacts. Interestingly, this difference has also been observed in microtubules, the polymers of tubulin.

The dynamic instability of ParM is caused by intrinsic ATP hydrolysis in the filament and the resulting ADP-bound filament being less stable, while being temporally protected by an ATP cap. We therefore assembled ParM+ATP filaments and obtained a 7.5 Å cryo-EM structure of these filaments (Extended Data Fig. 4). Since the nucleotide state of this structure may be mixed, we devised a way to inhibit the ATPase of ParM with vanadate. Addition of sodium orthovanadate to the ParM+ATP solution retarded filament disassembly and we captured these ParM+ATP+vanadate filaments before complete
were observed by image classification and averaging. e, Fourier shell correlation (FSC) curves for the four cryo-EM structures presented in this study (see Extended Data Table 1). f, Cryo-EM image of ParM+ADP filaments. High protein concentrations were required to obtain these filaments and monomeric proteins can be seen. This experiment was repeated six times. g, Comparison of filtered class averages of ParM+ATP and ParM+ADP filaments. Compared with the ATP-bound state, the pitch of the ParM+ADP filaments is reduced by ~3 Å (see Supplementary Video 2). h, Cryo-EM reconstruction of ParM+ADP filaments at 11 Å resolution with five copies of the ParM+ADP X-ray structure fitted. i, The same pseudo-atomic fit without the cryo-EM density. j, A magnified view of the perturbed inter-protofilament interface in the ParM+ADP filaments.


disassembly and obtained a 6.4 Å structure (Extended Data Fig. 4). Comparison of the three cryo-EM structures (+AMPPNP, +ATP, +ATP+vanadate) indicates that ParM is held in the same rigid, compact conformation, either until ATP is hydrolysed to ADP or until phosphate is released (Extended Data Fig. 4e, f).

Therefore the state with an expected conformational change should be ADP-bound, and since ParM+ADP has a much higher critical concentration for filament formation, we incubated a concentrated solution of ParM with ADP for cryo-EM (Fig. 2f). This specimen yielded a lower resolution reconstruction at 11 Å. Further refinement was not possible, and adding data did not improve the resolution of the structure (Fig. 2e and Extended Data Table 1). This indicated marked flexibility in the ParM+ADP filaments. Surprisingly, the overall helical pitch of the ParM+ADP filaments is smaller than in the other nucleotide states (ADP: 51 Å versus 54 Å; Fig. 2g and Extended Data Table 1). The previously solved ParM+ADP X-ray structure (PDB 1MWM) was subdivided into its two domains and these were fitted as rigid bodies into the ParM+ADP cryo-EM reconstruction (Fig. 2h, i). Since the helical symmetry of ParM+ADP filaments is different from the ParM+ATP filaments, the interaction of ParM
Figure 3 | ParM doublets formed in vitro. a, Cryo-EM images of ParM
doublets formed in vitro with crowding agent PEG 6000. This experiment was
repeated 15 times. b, Slice through an electron cryotomogram (cryo-ET)
showing clear lack of super-helicity in the doublets (see Supplementary Video
3). c, A two-dimensional class average of the ParM doublet. The thickest parts
of double helical ParM filaments have been indicated with yellow arrowheads 
(see Extended Data Fig. 5). d, Model of the doublet, shown in the same 
orientation as the class average in c (see Supplementary Video 4). e, An 
orthogonal, magnified view of the doublet cut at the plane shown as a dashed 
line in d. f, Atomic model of the doublet. Residues shown in red in one ParM 
filament interact with residues in orange in the other filament (see Extended 
Data Table 2). g, An orthogonal view of the doublet, with the filament axes 
going into the plane of the paper. One of the residues (S19) that forms the 
doublet interface has been highlighted (see Extended Data Fig. 6). 


subunits with each other is also different in the two states. In ParM+ADP filaments, salt bridges at the inter-protofilament interface can no longer be formed; instead, repelling charges are brought close together (Fig. 2j). Additionally, the change in helical pitch of the filament may also come with a substantial change in the longitudinal interface. These two factors together could explain why ParM+ADP filaments are less stable, and indicate why ParM filaments rapidly dissociate into monomeric form upon ATP hydrolysis, leading to dynamic instability (Supplementary Video 2).

Having described the structure of the ParM filaments, we then wished to put the structural data in the context of the bipolar spindles that segregate plasmid DNA in cells. For bipolar spindles to form, filamentous ParM subunits must engage in another interaction, inter-filament contacts, formed between double-helical filaments. It is known that incubation of ParM filaments with a crowding agent causes them to bundle. However, bundles are not amenable to high-resolution cryo-EM analysis because of their heterogeneity. To obtain a more defined sample, we titrated ParM+AMPPNP with varying amounts of crowding agent. When 2% polyethylene glycol (PEG) 6000 was added to ParM+AMPPNP, we found that ParM filaments dimerized to form ‘doublets’, containing two double-helical filaments (Fig. 3a and Extended Data Fig. 5a, b). In raw cryo-EM images, doublets appeared as two roughly parallel lines, with no evidence of supercoiling or twisting. Electron cryotomography (cryo-ET) of the doublet specimen confirmed that the filaments do not twist around each other (Fig. 3b and Supplementary Video 3).

We then performed reference-free two-dimensional classification of 
doublet images (Fig. 3c and Extended Data Fig. 5c). The two ParM 
filaments in the doublet were perfectly out of phase with each other. 
When viewed as a projection (in a cryo-EM class average), the thickest 
part of one filament in the doublet perfectly aligns with the thinnest part 
of the other double helical filament. We picked small segments along 
single ParM filaments that formed the doublets and aligned the segments 
to re-projections of the high-resolution ParM+AMPPNP structure we 
solved above. Using this alignment, directionality could be assigned to 
each filament in the doublet. We found that, in 84% of the cases, ParM in 
vitro doublets appeared to be made of two anti-parallel filaments 
(Extended Data Fig. 5d) while opposite matches were probably due to 
incorrect assignment of the short segments. 
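
The directionality assignment can be pictured with a small sketch. It assumes, for illustration only, that each short segment receives a polarity call of +1 or -1 from its best-matching re-projection and that a filament takes the majority call of its segments; the text does not state how per-segment matches were combined, and the function names and example calls are hypothetical.

```python
from collections import Counter


def filament_polarity(segment_calls):
    """Majority vote over per-segment polarity calls (+1 or -1); 0 if tied."""
    counts = Counter(segment_calls)
    if counts[1] == counts[-1]:
        return 0
    return 1 if counts[1] > counts[-1] else -1


def classify_doublet(calls_a, calls_b):
    """Label a doublet from the polarity calls of its two filaments."""
    pol_a = filament_polarity(calls_a)
    pol_b = filament_polarity(calls_b)
    if pol_a == 0 or pol_b == 0:
        return "undetermined"
    return "antiparallel" if pol_a != pol_b else "parallel"


# Hypothetical per-segment calls for four doublets (not data from the study).
doublets = [
    ([1, 1, 1, -1], [-1, -1, -1, -1]),
    ([1, 1, 1], [1, 1, -1]),
    ([-1, -1, 1], [1, 1, 1, 1]),
    ([1, -1], [1, 1]),
]
labels = [classify_doublet(a, b) for a, b in doublets]
antiparallel_fraction = labels.count("antiparallel") / len(labels)
```

Voting over several segments makes the filament-level call more tolerant of the occasional misassigned short segment.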

Using the class averages and the directionality assignment, we obtained an averaged model for the ParM doublet (Fig. 3d-g, Extended Data Fig. 5e, f and Supplementary Video 4). Two ParM monomers from adjoining filaments in the doublet model were found to be in an orientation similar to that observed in a previous crystal structure of ParM (Extended Data Fig. 6a, b). The model of the doublet predicts residues in ParM that should be important in doublet formation (Fig. 3f, g and Extended Data Table 2) and confirms earlier work, including mutations that modulate the strength of the inter-filament contact. One such set of mutations consisted of S19R and G21R. These mutations had been selected previously because they are located the furthest away from the filament axis, essentially sticking out, but are shown here directly to be involved in the inter-filament contact. In line with this, mutant ParM(S19R, G21R) spontaneously formed doublets and bundles (Extended Data Fig. 6c), without any crowding agent present in solution, validating both the previous total internal reflection fluorescence data and the current atomic model of the ParM doublet.

Previous imaging by total internal reflection fluorescence microscopy of the reconstituted ParMRC spindles and the model of the ParM doublet derived here both come from in vitro experiments. To test whether the doublets have physiological relevance, we visualized ParM filaments inside growing E. coli cells. Previously, direct observation of ParM filaments by cryo-EM was only possible by cryo-sectioning of frozen bacterial cells since whole cells were deemed too thick. Importantly, in vitreous sections filaments could only be visualized end-on, not revealing much about the inter-filament contacts. With new direct electron detectors, the signal-to-noise ratio has improved dramatically, so we aimed at imaging bipolar spindles directly inside cells using whole-cell cryo-ET.

As a first test, we overexpressed, in thin E. coli cells, a mutant of ParM (D170A) 
that hydrolyses ATP much more slowly. As observed 
previously in vitreous sections’, cryo-ET of these cells (Fig. 4a) 
allowed unambiguous identification of the overexpressed ParM 
mutant protein through its tendency to form extremely large bundles. 

We then used plasmids with different copy numbers”’, all of 
which contained the entire ParMRC locus and transformed them 
in turn into E. coli cells. Cryo-ET of these cells revealed the presence 
of doublets in all cases (Fig. 4b-d, Supplementary Videos 5 and 6, 
Extended Data Fig. 7 and Extended Data Table 3). All doublets were 
roughly aligned with the long cell axis, and were never observed 
perpendicular to the cell axis. Although bundles were observed in 
Figure 4 | ParM doublets in E. coli cells, imaged by cryo-ET. a, A mutant of 
ParM that hydrolyses ATP more slowly (D170A) was overexpressed in E. coli 
cells. Tomographic slices show large bundles of ParM blocking cell division. 
This experiment was performed two times. b, The ParMRC operon driven from 
high-copy-number plasmid pDD19. Tomographic slice showing an example of 
observed doublets. c, Tomographic slice for a medium-copy-number plasmid 
(pKG321). d, Tomographic slice for a low-copy-number plasmid, emulating 
the native low-copy-number R1 plasmids (pKG491, ‘mini-R1’ replicon) in E. 
coli (see Supplementary Videos 5 and 6 to view entire tomograms). Each 
experiment with different copy-number plasmids was performed once. 

e, Schematic depicting proposed asynchronous plasmid DNA segregation. 
Bipolar ParM spindles are seeded when replication has produced two parC 
centromeric regions, still in close proximity. Each seeds one unipolar ParM 
filament, which then come together in an antiparallel fashion to form the 
segregating bipolar spindle. Non-productive unipolar filaments or spindles that 
lack plasmid attachment will be destroyed through the dynamic instability of 
ParM. This is in contrast to earlier ideas in which all sister plasmids would be 
segregated through one bundle of filaments, containing double the number of 
unipolar filaments as the copy number of the plasmid in the cell’’. 


the high- and medium-copy-number plasmid cases, they were not 
observed in the low-copy-number (mini-R1) case, where partition- 
ing via ParMRC is required for plasmid stability**. These cryo-ET 
data are in line with previous immuno-light microscopy data, where 
single pole-to-pole filaments were only observed in 40% of cells’"® 
and the other cells showed several localized clusters or more com- 
plex patterns. 

The above data showed that ParM doublets are found in cells con- 
taining the ParMRC locus, and are probably the machinery that act- 
ively segregates plasmid DNA to opposite ends of the dividing cell, 
even though antiparallel arrangement of ParM filaments in cellular 
doublets can only be inferred from the in vitro studies above. It is 
interesting to observe that the ratios of doublets observed per cell were 
the same as the ratios of the expected copy numbers of the three 
plasmids, although it needs to be noted that numbers remain small 
because of the low-throughput nature of cryo-ET (Extended Data 
Table 3). The ratios might indicate that each doublet carries a defined 
payload of DNA cargo, a fixed number of plasmids containing the 
parC locus. We propose that ParMRC spindles consisting solely of 
doublets elegantly circumvent the problem of synchronizing plasmid 
replication, filament attachment and bundle formation for all plasmids 
in the cell: each pair of plasmid sisters is segregated by their own 
spindle. The resulting asynchronous plasmid segregation is schematic- 
ally summarized in Fig. 4e. Indeed, it is known that R1 plasmids are 
replicated randomly throughout the cell cycle****. In contrast, eukar- 
yotic DNA segregation requires cohesion, kinetochore checkpoints 
and other dedicated machinery since all material is segregated with 
one coordinated and synchronized spindle. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. 
Protein expression and purification. ParM (UniProt: PARM_ECOLX) and 
ParM mutants were expressed from plasmid pJSC1 and its derivatives” in E. coli 
BL21-AI cells and purified as described previously*”*. Buffer MR was used in all 
experiments: 50 mM Tris-HCl, 100 mM KCl, and 1 mM MgCl₂, pH 7.0. 
Wild-type ParM and ParM(S19R, G21R). ParM was purified by ammonium sul- 
phate precipitation (at a final concentration of 10% (sat.) ammonium sulphate) of 
the lysate, followed by addition of ATP to the re-suspended pellet. ParM filaments 
were pelleted by centrifugation at 100,000g, and the resulting pellet containing 
pure protein was re-suspended in buffer and further purified by size exclusion 
chromatography on a Sephacryl S-200 column (GE Healthcare). 

ParM(K258D, R262D). The protein was purified using a 5 ml HiTrap Q HP 
column (GE Healthcare), and eluted at increasing KCl concentrations. Fractions 
containing ParM were pooled and further purified by size exclusion on a Sephacryl 
S-200 column (GE Healthcare) into buffer MR. Concentrated aliquots of pure 
protein were frozen and stored at −80 °C until further investigation. 

Sample preparation for microscopy. ParM+AMPPNP and ParM+ATP. ParM 
protein (10 µM) was incubated with 2 mM nucleotide in a total volume of 100 µl for 
5 min at room temperature (22 °C) before cryo-EM sample preparation. The same 
procedure was used for polymerization of the ParM(S19R, G21R) mutant. 
ParM+ATP+vanadate and ParM+ADP+vanadate. ParM protein (10 µM) was 
incubated with 2 mM ATP or ADP and 4 mM sodium orthovanadate in a total 
volume of 100 µl for 2 h or 5 min at room temperature before cryo-EM sample 
preparation. 

ParM+ADP. 400 µM ParM was incubated with 10 mM ADP in a total volume of 
25 µl for 5 min at room temperature. 

ParM(K258D, R262D)+AMPPNP. Protein (60 µM) was incubated with 2 mM 
AMPPNP in a total volume of 100 µl for 5 min at room temperature. 

ParM in vitro doublets. ParM protein (20 µM) was incubated with 2 mM 
AMPPNP in the presence of 2% (w/v) PEG 6000 in a total volume of 100 µl for 5 min at room 
temperature. 

ParM(D170A)-overexpressing cells. ParM was expressed to high levels for cryo-ET 
using the plasmid pRBJ212 (ParM(D170A), ptac promoter)’ transformed into E. 
coli B/R266 cells. Cells were grown in M9 medium at 30 °C and induced with 1 mM 
IPTG at an attenuance D600 nm ~ 0.5. Samples were prepared 4 h after induction. 
Bacterial cells with different copy-number plasmids containing the ParMRC locus. E. 
coli strain B/R266 (ref. 26) was transformed with high- (pDD19), medium- 
(pKG321) or low-copy (pKG491) plasmids and grown in M9 medium supplemented 
with 100 µg ml⁻¹ ampicillin at 30 °C (ref. 21). Cells were grown to D600 nm ~ 0.4-0.6 
(logarithmic growth phase) before sample preparation for cryo-EM. 
Cryo-EM grid preparation. Samples for cryo-EM were prepared by pipetting 
2.5 µl of the sample onto freshly glow-discharged Quantifoil Cu/Rh 200 mesh 
grids (R2/2 for purified protein, and R3.5/1 for cellular tomography) and 
plunge-freezing them in liquid ethane using a Vitrobot Mark IV (FEI). For cryo-ET only, 11 µl of 
sample was pre-mixed with 1 µl of protein-A conjugated with 10 nm colloidal 
gold (Cell Microscopy Center, Utrecht University, The Netherlands). Plunged 
grids were transferred to liquid nitrogen and stored. 

Electron microscopy data collection. Two-dimensional cryo-EM data were collected 
using either an FEI Krios microscope operated at 300 kV or an FEI Spirit 
microscope operated at 120 kV. High-throughput data were collected on the FEI 
Krios using EPU software at an unbinned calibrated pixel size of 1.30 Å or 1.07 Å on 
a Falcon II direct electron detector. A combined total dose of 25-32 electrons per 
square ångström was applied with each exposure that lasted 1 s. Images were 
collected at 1-6 µm underfocus. Tilt series data were collected on an FEI Krios 
equipped with a Quantum energy filter (Gatan) using SerialEM software’, on a K2 
direct electron detector operating in counting mode. Tilt series data were typically 
collected from ±60° with 1° tilt increment at 4-12 µm underfocus with a combined 
dose of about 120 electrons per square ångström applied over the entire series. 
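
As a rough illustration of the tilt scheme just described, the following sketch counts the images in a series and spreads the cumulative dose evenly across them. It assumes a symmetric range of −60° to +60° in 1° steps and a uniform dose per image; neither the even split nor the function names come from the paper, and the actual acquisition order and dose weighting are not stated here.

```python
def tilt_angles(max_tilt_deg=60, step_deg=1):
    """Return every tilt angle from -max_tilt_deg to +max_tilt_deg inclusive."""
    return list(range(-max_tilt_deg, max_tilt_deg + 1, step_deg))


def dose_per_tilt(total_dose, angles):
    """Split the cumulative dose evenly over all tilt images (an assumption)."""
    return total_dose / len(angles)


angles = tilt_angles()                   # assumed symmetric +/-60 degree range
per_image = dose_per_tilt(120.0, angles)  # about 120 electrons per square angstrom in total
print(len(angles), "images,", round(per_image, 2), "electrons per square angstrom each")
```

Under these assumptions each image receives roughly one electron per square ångström, which gives a sense of why the high signal-to-noise of direct detectors matters for whole-cell tomography.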
Image processing and data analysis. Real-space helical reconstruction. An aver- 
aged power spectrum for each cryo-EM image was calculated using CTFFIND”, 
and images showing clear Thon rings were retained. ParM filaments were 
extracted from the selected images using SPRING and EMAN2””. The helical 
symmetry of each sample was accurately determined by comparing the power 
spectrum of the aligned segments with power spectra of re-projections of the 
calculated reconstructions. Experimentally determined helical parameters 
(Extended Data Table 1) were used for refinement using the program 
segmentrefine3D in SPRING. The final volumes were compensated for the 
B-factor and filtered to the obtained resolutions (Extended Data Table 1). 
Resolution of the structure was estimated using gold-standard Fourier shell cor- 
relation measurements in SPRING and additionally using ResMap”’. Visualization 
of densities was performed in UCSF Chimera”. 
Atomic model building. The atomic structure from PDB 4A62 (ref. 3) was fitted 
into the cryo-EM density of ParM+AMPPNP using MolRep*. Maximum-likelihood 
refinement of the atomic structure against the cryo-EM density was performed 
in REFMAC5 (ref. 34) using standard protein stereochemistry and 
additional external restraints based on PDB 4A62, generated in ProSMART™. 
Model building was performed in COOT and MAIN**””. 

Rigid body fitting. ParM was divided into two sub-domains, based on the previous 
ParM+ADP X-ray structure (PDB 1MWM). Each sub-domain was declared as a 
rigid body and these were fitted into the ParM+ADP filament structure using 
REFMAC5. 

Polarity assignment of ParM filaments in doublets. First, images of ParM doublets 
were carefully selected on the basis of image quality (as assessed by a visual inspection 
of power spectra), and by a visual assessment of the distance between the two fila- 
ments in the doublet. The assumption made from the appearance of the class averages 
was that images in which the distance between the centres of two ParM filaments in 
the doublet was maximum would show ParM filaments entirely in the same x-y 
plane (the z-axis being the path of the electron beam in the microscope). The two 
ParM filaments in all the doublets in these selected images were picked manually 
using EMAN2 (ref. 30). The manual pick was used to extract short segments along 
each filament in the doublet. The extracted segments were aligned to re-projections of 
the high-resolution ParM+AMPPNP filament model using SPRING”. In five out of 
the six doublets analysed, the assigned directionality of ParM filaments was predo- 
minantly anti-parallel and in one case the assignment was mostly parallel. 
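
The bookkeeping behind such a polarity assignment can be sketched as a majority vote. In this minimal example each short segment contributes a direction call of +1 or −1 (the sign of its best-matching re-projection), each filament takes the majority direction, and the doublet is labelled by comparing the two filaments. The function names and the example calls are hypothetical, and the vote is only one plausible way to summarize the per-segment results; the paper does not state the exact decision rule.

```python
from collections import Counter


def filament_polarity(segment_calls):
    """Majority vote over per-segment direction calls (+1 or -1).

    Ties are resolved towards +1, which is an arbitrary choice.
    """
    counts = Counter(segment_calls)
    return 1 if counts[1] >= counts[-1] else -1


def doublet_arrangement(calls_a, calls_b):
    """Label a doublet from the majority polarity of its two filaments."""
    same = filament_polarity(calls_a) == filament_polarity(calls_b)
    return "parallel" if same else "antiparallel"


# Hypothetical per-segment calls for the two filaments of one doublet.
filament_a = [1, 1, 1, -1, 1]
filament_b = [-1, -1, 1, -1, -1]
arrangement = doublet_arrangement(filament_a, filament_b)
print(arrangement)
```

Occasional opposite calls within a filament, as in the example, are tolerated by the vote, which matches the observation that short segments are sometimes assigned incorrectly.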
Derivation of the doublet model. The ParM doublet is not a true helical specimen, 
thus conventional helical reconstruction could not be performed. This difficulty of 
characterizing higher-order filament structures of ParM filaments has been previously 
reported”. The average distance between the centres of the two ParM 
filaments in the doublet was found to be 65.1 Å by analysis of one-dimensional 
line-profiles of all obtained class averages. Two copies of the high-resolution cryo-EM 
structure of the ParM+AMPPNP filaments were accordingly placed with 
their centres 65.1 Å apart in space in an anti-parallel orientation. The placement 
was repeated for all possible combinations of the azimuthal angles of both fila- 
ments. Re-projections of all these resulting volumes were aligned with all obtained 
class averages. As expected intuitively from an inspection of the class averages, 
models in which the thickest part of one ParM filament overlapped with the 
thinnest part of the other filament in the doublet had higher cross-correlation 
scores. We placed two copies of the atomic structure of the ParM+AMPPNP in 
the volume with the highest score. Since this was not a standard cryo-EM recon- 
struction, meaning resulting atomic accuracy would be somewhat lower, we only 
used the Cα atoms for determining distances shown in Extended Data Table 2. 
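
The exhaustive search over azimuthal angles can be illustrated with a toy model. In the sketch below, a filament's projected width along its axis is represented by a cosine whose phase is the azimuth, a "doublet" is the pair of width profiles, and every combination of the two angles is scored against a target by normalised cross-correlation. The width model, the 10° step and all function names are assumptions made for illustration; the real procedure compares re-projections of three-dimensional volumes with class averages.

```python
import math
from itertools import product


def width_profile(phi_deg, n=64, period=16.0):
    """Toy projected width of one filament along its axis for azimuth phi."""
    phi = math.radians(phi_deg)
    return [1.0 + 0.3 * math.cos(2 * math.pi * z / period + phi) for z in range(n)]


def doublet_signature(phi1_deg, phi2_deg):
    """Toy stand-in for a re-projection: the two filaments' width profiles."""
    return width_profile(phi1_deg) + width_profile(phi2_deg)


def ncc(a, b):
    """Normalised cross-correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def best_azimuths(target, step_deg=10):
    """Score every (phi1, phi2) pair on a grid and return the best-scoring pair."""
    angles = range(0, 360, step_deg)
    return max(product(angles, angles), key=lambda p: ncc(doublet_signature(*p), target))


# Hypothetical target generated with known angles, standing in for a class average.
target = doublet_signature(40, 220)
best = best_azimuths(target)
print(best)
```

A grid search is affordable here because only two angles are free once the inter-filament distance and the anti-parallel orientation are fixed.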
Tomographic reconstructions. Tilt series data were aligned using IMOD* and 
three-dimensional reconstructions were conducted using the SIRT algorithm imple- 
mented in Tomo3D”. Visualization of data was done using IMOD and UCSF 
Chimera”. 
Extended Data Figure 1 | Resolution estimate of the ParM+AMPPNP 
reconstruction. a, Resolution of the ParM+AMPPNP reconstruction was 
estimated using ResMap and this estimate was plotted back onto the cryo-EM 
density. Blue indicates high resolution; red indicates lower resolution. b, The 
power spectrum of the aligned segments (left) compared with the power 
spectrum of the re-projection of the cryo-EM reconstruction (right). A 
reflection is observed in both cases at 4.8 Å, indicating that the resolution 
extends beyond this shell. See Fig. 2e for Fourier shell correlation curves. 
Extended Data Figure 2 | Intra- and inter-protofilament interactions in 
ParM filaments. a, Atomic model of one protofilament (strand) of ParM is 
shown with the residues at the protein-protein interface highlighted in red. See 
Extended Data Table 2 for a detailed list. b, A magnified view of the interface. 
Three residues at the interface have been labelled. c, The complete ParM 
filament (that is, both protofilaments/strands) shown end-on. d, Atomic model 
of the ParM filament with the inter-protofilament residues at the protein-protein 
interface highlighted in orange. e, A magnified view of d. Salt bridging 
residues are labelled. f, An orthogonal view of d. See Extended Data Table 2 for a 
detailed list of interacting residues. 
Extended Data Figure 3 | The ParM inter-protofilament interface is small 
but important. a, Cryo-EM density for the ParM+AMPPNP filament is 
shown at an isosurface contour level of 2.0σ from the mean value. Overlaid on 
the density, refined atomic coordinates from REFMAC are additionally 
displayed as grey ribbons. Residues forming salt bridges at the inter-protofilament 
interface are highlighted. b, The same figure as a, except the cryo-EM 
density shown at an isosurface contour level of 1.5σ from the mean; c, 1.0σ 
from the mean. d-f, A magnified view of the primary salt-bridged interface 
consisting of charged residues that form the ParM inter-protofilament 
interface. The cryo-EM density is shown as a mesh at three different contour 
levels to demonstrate resolved side-chain densities. Positively charged residues 
are highlighted in red; negatively charged residues are highlighted in orange. 
g, Two residues (K258 and R262) that were the best resolved (marked with an 
asterisk in d), were mutated to aspartic acid to test the importance of this 
inter-protofilament interface. A cryo-EM image of this mutant protein assembled 
with AMPPNP is shown. A much higher concentration of the protein was 
required to obtain filaments on cryo-EM grids (Methods). This experiment was 
repeated four times. h, Randomly selected cryo-EM images of 
ParM+AMPPNP and ParM(K258D, R262D)+AMPPNP were used to count 
occurrences of straight and bent filaments by visual inspection. The results of 
this quantification are shown as a percentage bar diagram. For the ParM 
protein, 82% of all filaments were classified as straight, while 18% were bent 
(n = 345). Using exactly the same classification criteria, only 15% of the 
filaments were found to be straight and 85% of the filaments were bent (n = 45) 
for the ParM(K258D, R262D) mutant protein. i, Reference-free class averages 
show that most of the ParM(K258D, R262D) filaments are made up of double 
protofilaments like wild-type ParM. Some class averages show evidence of 
bending. A few class averages show that single protofilaments were present in 
the sample (lower panels). However, the double mutation destabilizes the entire 
ParM filament, making filament formation an unfavourable reaction, 
illustrating that even though the inter-protofilament interface is small, it is 
critical for ParM filament formation. 
Extended Data Figure 4 | ParM adopts a compact conformation until ATP 
is hydrolysed to ADP or until phosphate is released. a, ParM protein (10 µM) 
was incubated with ATP (2 mM) and cryo-EM samples were prepared after 
5 min. Many filaments were observed on the grid. This experiment was 
repeated ten times. b, After 2 h, no filaments were seen in the same reaction. 
Presumably, ATP had been hydrolysed and ParM had returned to monomeric 
form. This experiment was repeated three times. c, When sodium 
orthovanadate (4 mM) was included in the reaction, filaments could be 
observed, even after 2 h. This experiment was repeated three times. d, The same 
reaction as a, except ATP was replaced by ADP. No filaments were observed in 
this reaction. This experiment was repeated four times. e, f, We performed 
real-space helical reconstruction of the ParM+ATP filaments (red) and 
ParM+ATP+vanadate filaments (yellow), and compared them with the 
ParM+AMPPNP filament structure (green). Comparison shows that ParM is 
held in a very similar conformation until hydrolysis of ATP is complete or until 
phosphate is released since we currently cannot distinguish these two possible 
effects of vanadate. See Fig. 2e for resolution estimates and Extended Data Table 
1 for image-processing statistics. 
Extended Data Figure 5 | Model of the ParM doublet. a, A cryo-EM image of 
ParM+AMPPNP + 2% PEG 6000. Instances of doublets are marked with 
yellow arrowheads. This experiment was repeated 15 times. b, More 
examples of ParM doublets observed in cryo-EM. c, Class averages of the 
doublets. d, Directionality assignment of the filaments in the doublet. 
Individual sub-segments and their assigned directionality are indicated by 
triangles, coloured on the basis of the cross-correlation score in the alignment 
procedure: red indicates a poor cross-correlation score; green indicates a 
good score. e, A schematic model of the anti-parallel ParM doublet. 
Directionality is indicated with a yellow arrow. f, The thickest parts of ParM 
filaments of the doublet (as they appear in projection) are marked with black 
arrowheads. 
Extended Data Figure 6 | Validation of the doublet model. a, Two ParM 
filaments arranged in an anti-parallel orientation, as obtained from the ParM 
cryo-EM doublet model. b, Two ParM filaments arranged in an anti-parallel 
orientation, obtained from crystal packing of a monomeric ParM X-ray 
structure (PDB 4A62)*. c, Two residues at the interface of the doublet (see 
Extended Data Table 2), S19 and G21, were mutated to arginine to improve 
affinity of ParM filaments to each other. Cryo-EM images of the mutant protein 
with AMPPNP show spontaneous doublet formation and filament bundling 
without crowding agent, validating the doublet model. This experiment was 
repeated six times. 
Extended Data Figure 7 | ParM bundles and doublets observed in vivo. 
a-k, E. coli B/R266 cells were transformed with a high-copy (pDD19) or 
medium-copy (pKG321) plasmid containing the ParMRC locus. 
Transformed cells were grown to log phase and then prepared for cryo-EM. 
This figure shows a gallery of ParM bundles (blue arrows) and doublets 
(yellow arrows) observed in these cells. a, c, e, h, Cells transformed with 
the high-copy-number plasmid; b, d, f, g, i, j, k, cells transformed with the 
medium-copy-number plasmid. Each experiment with different copy-number 
plasmids was performed only once owing to the low-throughput 
nature of cryo-ET. 
Extended Data Table 1 | Image processing statistics for cryo-EM reconstructions of ParM filaments 

                                   ParM+AMPPNP   ParM+ATP   ParM+ATP+vanadate   ParM+ADP 
Resolution, FSC at 0.143 (Å)       4.3           7.5        6.4                 11.0 
Filament pitch (Å)                 54.0          54.0       53.8                51.0 
Subunits per turn                  2.18          2.18       2.18                2.18 
Input segment step size (Å)        268           161        161                 153 
Segment size for alignment (Å²)    364x364       400x400    400x400             700x700 
Asymmetric units included          561,231       13,825     122,864             53,648 
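
The pitch and subunits-per-turn values in the table above can be converted into the axial rise and rotation per subunit, which is how helical symmetry is usually supplied to reconstruction software. The sketch below assumes that "pitch" means the axial advance for one full turn of the helix traced through successive subunits; the function name is hypothetical and the handedness of the twist is ignored.

```python
def helical_rise_and_twist(pitch, subunits_per_turn):
    """Convert pitch (in angstroms) and subunits per turn into (rise, twist).

    rise is the axial advance per subunit in angstroms; twist is the
    rotation per subunit in degrees, with handedness ignored.
    """
    rise = pitch / subunits_per_turn
    twist = 360.0 / subunits_per_turn
    return rise, twist


# Values taken from Extended Data Table 1.
samples = {
    "ParM+AMPPNP": (54.0, 2.18),
    "ParM+ATP": (54.0, 2.18),
    "ParM+ATP+vanadate": (53.8, 2.18),
    "ParM+ADP": (51.0, 2.18),
}
for name, (pitch, per_turn) in samples.items():
    rise, twist = helical_rise_and_twist(pitch, per_turn)
    print(f"{name}: rise {rise:.2f} angstroms, twist {twist:.2f} degrees")
```

Because the number of subunits per turn is the same in all four samples, the shorter pitch of the ParM+ADP filament corresponds, under this reading, to a smaller rise per subunit rather than a different twist.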
Extended Data Table 2 | Interface-forming residues of ParM 


Interface Residue Sequence 
Numbers 

Inter-protofilament 1 MLVF IDDGSTNIKLOWQESDGTIKQHISPNSFKREWAVSFGDKKVFNYTLN@BOYSFDPI 

interface 61 SPDAVVTTNIAWOY SDVNVVAVHHALLTSGLPVSEVDIVCTLPLTEY YDRNNOPNTENIE 
121 RKKANFRKKITLNGGDTFTIKDVKVMPESIPAGYEVLQELDELDSLLIIDLGGTTLDISQ 
181 VMGKLSGISKIYGDSSLGVSLVTSAVKBALSLARTKGSSYLADDIIIHRKDNNYLKORIN 
241 DENKISIVTEAMNEALBRLEQRVLNTLNEFSGYTHVMVIGGGAELICDAVKKHTQIRDER 
301 FFKTNNSQYDLVNGMYLIGN 

Intra-protofilament = 1 MLVF IDDGSTNIKLQWQESDGTIKQHISPNSFKREWAWSEGDKAVFNYTLNGEQYSFDPE 

interface 61 SPDAVVTTNIAWQYSDVNVVAVHHALLTSGLPVSEVDIVCTLPLTEYMDRNNOPNTENIE 
121 RKKANFRKKITLNGGDTFTIKDVKVMBES IPAGYEVLOELDEBDSLLIIDLGGTTLDIS 
181 GKLS@ESKI YGDSSLGVSLVTSAVKDALSLARTRGSSYLADDI I IHRKDNNYLRORI 
241 ENKISIVTEAMNEALRKLEQRVLNTLNEFSGYZHVMVIGGGAELICDAVKKHT@IRDE 
301 FFKTNNSQYDLVNGMYLIGN 

Inter-filament 1 MLVF IDDGSTNIKLOWQESDGTIKOHISPNSFKREWAVSFGDKKVFNYTLNGEQYSFDPI 

interface 61 SPDAVVTTNIAWOYSDVNVVAVHHALLTSGLPVSEVDIVCTLPLTEY YDRNNOPNTENIE 

(doublet) 121 RKKANFRKKITLNGGDTFTIKDVKVMPES IPAGYEVLOBLDELDSLLIIDLGGTTLDISQ 
181 VMGKLSGISKIYGDSSLGVSLVTSAVKDALSLARTKGSSYLADDIIIHRKDNNYLKORIN 
241 DENKISIVTEAMNEALRKLEQRVLNTLNEFSGYTHVMVIGGGAELICDAVKKHTQIRDER 
301 FFKTNNSQYDLVNGMYLIGN 


Residues of ParM that are part of interfaces have been highlighted. Inter- and intra-protofilament interface-forming residues have been highlighted in red and green respectively. These residues have been 
assigned using a 4 Å distance cut-off based on the ParM+AMPPNP structure. Residues forming the inter-filament interface in the ParM doublet have been highlighted in blue. This assignment was based on a 7 Å 
distance cut-off for Cα atoms in the derived model of the ParM doublet because of the lower accuracy of the model. 
Extended Data Table 3 | Instances of single, double and bundled ParM filaments seen in bacterial cells with different copy-number plasmids 

Plasmid type   Single filaments   Double filaments   Bundles   Total number of cells imaged   Doublets per cell 
High copy      5                  35                 8         6                              5.83 
Medium copy    11                 36                 2         23                             1.56 
Low copy       4                  4                  0         14                             0.28 


The ParMRC locus was inserted into high-, medium- and low-copy-number plasmids (plasmids pDD19, pKG321 and pKG491 respectively). These plasmids were in turn inserted into E. coli cells and imaged using 
cryo-ET. The ratio of observed doublets per cell (that is, the number of doublets observed divided by the number of cells imaged) was 5.8:1.6:0.3 (~19:5:1). These ratios are roughly the same as the expected copy- 
number ratios of the different copy-number plasmids. 
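
The doublets-per-cell figures follow directly from the counts in the table, as this short sketch shows. The dictionary layout and variable names are illustrative only; the counts are those reported in Extended Data Table 3.

```python
# (doublets observed, cells imaged) for each plasmid type, from Extended Data Table 3.
counts = {
    "high": (35, 6),
    "medium": (36, 23),
    "low": (4, 14),
}

# Doublets per cell: number of doublets divided by number of cells imaged.
per_cell = {name: doublets / cells for name, (doublets, cells) in counts.items()}

# Express each value relative to the low-copy plasmid.
ratios = {name: value / per_cell["low"] for name, value in per_cell.items()}

for name in counts:
    print(f"{name}: {per_cell[name]:.2f} doublets per cell, ratio {ratios[name]:.1f}")
```

Computed from unrounded values the ratio comes out near 20 : 5.5 : 1, whereas rounding each doublets-per-cell value to one decimal place first (5.8 : 1.6 : 0.3) gives the ~19 : 5 : 1 quoted above. With so few cells imaged, the two are not meaningfully different.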


©2015 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature14405 


Structures of human phosphofructokinase-1 and 
atomic basis of cancer-associated mutations 


Bradley A. Webb¹*, Farhad Forouhar²*, Fu-En Szu¹, Jayaraman Seetharaman², Liang Tong² & Diane L. Barber¹ 


Phosphofructokinase-1 (PFK1), the ‘gatekeeper’ of glycolysis, cata- 
lyses the committed step of the glycolytic pathway by converting 
fructose-6-phosphate to fructose-1,6-bisphosphate. Allosteric 
activation and inhibition of PFK1 by over ten metabolites and in 
response to hormonal signalling fine-tune glycolytic flux to meet 
energy requirements’. Mutations inhibiting PFK1 activity cause 
glycogen storage disease type VII, also known as Tarui disease’, 
and mice deficient in muscle PFK1 have decreased fat stores’. 
Additionally, PFK1 is proposed to have important roles in meta- 
bolic reprogramming in cancer**. Despite its critical role in glu- 
cose flux, the biologically relevant crystal structure of the 
mammalian PFK1 tetramer has not been determined. Here we 
report the first structures of the mammalian PFK1 tetramer, for 
the human platelet isoform (PFKP), in complex with ATP-Mg²⁺ 
and ADP at 3.1 and 3.4 Å, respectively. The structures reveal substantial 
conformational changes in the enzyme upon nucleotide 
hydrolysis as well as a unique tetramer interface. Mutations of 
residues in this interface can affect tetramer formation, enzyme 
catalysis and regulation, indicating the functional importance of 
the tetramer. With altered glycolytic flux being a hallmark of can- 
cers®, these new structures allow a molecular understanding of the 
functional consequences of somatic PFK1 mutations identified in 
human cancers. We characterize three of these mutations and show 
they have distinct effects on allosteric regulation of PFKP activity 
and lactate production. The PFKP structural blueprint for somatic 
mutations as well as the catalytic site can guide therapeutic target- 
ing of PFK1 activity to control dysregulated glycolysis in disease. 

Previous attempts to obtain the structure of mammalian tetrameric 
PFK1 used native protein or recombinant protein generated in yeast or 
bacteria. A limitation of using native PFK1 is that most mammalian 
tissues express all three isoforms: muscle (PFKM), liver (PFKL) and 
platelet (PFKP)’. Although there are structures of PFK from prokar- 
yotes*"' and eukaryotes’? including dimeric rabbit PFKM expressed 
in Escherichia coli”, these structures provide limited information on the 
catalytic interface or the conformational changes with regulation of the 
tetrameric mammalian enzyme. To overcome current limitations with 
structural studies of human PFK1, we produced recombinant PFKP by 
using a baculovirus expression system. The recombinant enzyme, puri- 
fied to homogeneity (Extended Data Fig. 1a), is tetrameric as shown by 
transmission electron microscopy (TEM; Fig. 1a). The activity and 
regulation of recombinant PFKP, including high cooperativity for 
fructose-6-phosphate (F6P), a high affinity for ATP-Mg²⁺, and high 
sensitivity to ATP inhibition (Extended Data Fig. 1b, c), were similar to 
those previously reported for mouse PFKP expressed in yeast". 

We determined the crystal structure of the PFKP tetramer in complex 
with ATP-Mg²⁺ at 3.1 Å resolution (Fig. 1b-d and Extended Data 
Fig. 2). The atomic model has good agreement with the crystal- 
lographic data and the expected geometric parameters (Extended 
Data Table 1). The asymmetric unit contained two tetramers, and the 
eight protomers have essentially the same conformation (with r.m.s.d. 


of ~0.3 Å between any pair of them, Extended Data Fig. 3). The overall 
organizations of the two tetramers are slightly different, reflected in part 
by changes in the relative orientations of the two dimers (Extended 
Data Fig. 3). 

Each PFKP tetramer measures 13.8 nm by 10.3 nm, similar in size 
and shape to what we calculated from TEM images (Fig. 1a, b). The 
tetramer is composed of a dimer of dimers, and the interface between 
the two dimers is relatively small, with a buried surface area of 700 Å² 
for each subunit (arrow labelled ‘t’ in Fig. 1b, c). The two subunits of 
the dimer are arranged in an antiparallel orientation, confirming previous 
predictions’®, with a buried surface area of 1,800 Å² for each 
subunit. The active site is located at the interface between the two 
subunits (arrow labelled ‘c’ in Fig. 1b, e). 

The structure of PFKP is likely to represent the active conformation 
of the enzyme. The crystal was prepared at pH 7, near physiological 
pH, and residues in the active site that are important for substrate 
binding and/or catalysis have similar conformations in PFKP as in 
other PFK structures (Fig. 1e). The F6P substrate, as observed in the 
Saccharomyces cerevisiae PFK (ScPFK) structure’, can be readily 
accommodated in the PFKP active site for catalysis. The invariant sub- 
strate binding residues His208 and Arg210 from the second protomer of 
the dimer are located ~6 Å away from F6P, suggesting that a closure of 
this region of the active site may occur upon F6P binding and catalysis. 
PFKP contains only one ATP in each subunit bound to the active site, 
despite the presence of 10 mM ATP during crystallization and even 
though the allosteric adenine nucleotide-binding sites are functional 
(Extended Data Fig. 1b, c). We also observed the binding of two phos- 
phate groups in each protomer at positions corresponding to the pro- 
karyotic PFK effector sites (Extended Data Fig. 2c, d). The enzyme 
activity, regulation and stability of PFK1 are controlled by binding 
phosphate or sulphate ions’”"*. PFKP displayed a loss of ATP inhibition 
in the presence of 10 mM sodium sulphate (Extended Data Fig. 2e), 
suggesting that phosphate-binding and inhibitory site ATP-binding are 
mutually exclusive in the tetrameric structure. 

We also determined the crystal structure of PFKP in complex with 
ADP at 3.4 Å resolution, at pH 7.5. The relatively low resolution of this 
structure precludes a detailed structural comparison with that of the 
ATP-Mg²⁺ complex. However, it is clear there is a dramatic change in 
the relative positions of the two domains in each protomer (Fig. 2a), 
and especially the overall structures of the dimer and tetramer 
(Fig. 2b). A rotation of ~12° is observed between the subdomains of 
the ADP complex protomer relative to the ATP complex, leading to 
an 8 Å shift in the substrate binding domain relative to the nucleotide 
binding domain. An effect of this conformational change is to open the 
catalytic site (Fig. 2c, d), which may play a role in the release of pro- 
ducts. The conformational changes observed between the ATP and 
ADP complexes of PFKP are different from those seen for the R- and 
T-states of bacterial PFK (Extended Data Fig. 4)*"’. 

We tested the importance of hydrophobic and electrostatic interac- 
tions at the tetramer interface for enzyme activity (Fig. 3a). Most 
Figure 1 | Structure of ATP-bound tetrameric PFKP is in the active 
conformation. a, TEM images of PFKP. Left: scale bar, 50 nm. Right: indicated 
dimensions are the mean ± s.d. of 37 individual particles. b, c, Ribbon diagrams 
of PFKP displaying the relative orientation of PFKP tetramer subunits. Each 
subunit is individually coloured. Arrows labelled ‘c’ and ‘t’ indicate the catalytic 
and tetramer interfaces, respectively. View in c is rotated 90° from b. d, View 
rotated 90° from c displaying the catalytic sites. The front subunits are shown in 
ribbon representation and the rear subunits depicted as surface models. ATP, 
black; Mg²⁺, dark green; phosphate, yellow. e, The binding mode of 
ATP-Mg²⁺ at the active site of PFKP. The binding mode of the F6P substrate in 
ScPFK” is also shown. 


residues at the interface are hydrophobic. Tyr645 and Phe649 from the 
two subunits form a π-stack of four aromatic side chains in the 
interface, with Phe649 in the middle (Fig. 3a). Despite the overall 
similarities in organization between the PFKP tetramer and that of ScPFK, 
there are significant differences in the tetramer interface between the 
two enzymes (Extended Data Fig. 5). Phe649, which is evolutionarily 
conserved in metazoans but not in yeasts, is a leucine residue in the ScPFK 
α-subunit. However, this Leu residue has a completely different local 
environment in ScPFK compared with Phe649 in PFKP. We generated 
recombinant PFKP with Phe649 mutated to Leu (Extended Data 
Fig. 6a) to test whether Phe649 is required for tetramer formation. 
Previous studies showed that PFK1 assembles into tetramers in a 
concentration- and ligand-dependent manner, with allosteric 
activators favouring the formation of tetramers and allosteric inhibitors 
favouring the formation of dimers. In a buffer containing ADP, 
Figure 2 | PFKP undergoes a large conformational change upon ATP 
hydrolysis. a, Structural overlay of ATP-bound (cyan) and ADP-bound (grey) 
PFKP subunits. b, Structural overlay of ATP-bound (coloured ribbon) and 
ADP-bound (grey surface) PFKP tetramers. Arrows labelled ‘c’ and ‘t’ represent 
the catalytic and tetramer interfaces, respectively. c, d, Conformation of the 
active site in the ATP-Mg²⁺-bound (c) and ADP-bound (d) structures. 


ATP and F6P, TEM showed that wild-type (WT) PFKP particles had 
the dimensions and appearance of tetramers (Fig. 3b, c). In contrast, 
PFKP-F649L particles were the same width but half the length of WT, 
consistent with dimer formation along the catalytic interface (Fig. 3b, c 
and Extended Data Fig. 6b). We compared the PFKP-F649L particles 
with those induced by the inhibitor citrate, which causes PFKM to form 
dimers. In a buffer containing 1 mM citrate, we saw two sizes of 
particles with WT PFKP: one with the dimensions of tetramers and the 
other with the dimensions of dimers along the catalytic interface (Fig. 3b, c 
and Extended Data Fig. 6c), further confirming dimer formation by 
PFKP-F649L. The catalytic activity of PFKP-F649L was reduced by 98% 
(Extended Data Fig. 6d) compared with WT enzyme, indicating that 
tetramer formation is necessary for PFK1 activity. 

The structures suggest that an electrostatic interaction at the 
tetramer interface between Arg613 of one subunit and Glu657 of the 
adjacent subunit (Fig. 3a) may be important for enzyme function. This salt 
bridge was observed only in the ATP-bound structure and not in the 
ADP-bound PFKP structure or dimeric rabbit PFKM structures, 
suggesting that it may contribute to maintaining an active form of 
the mammalian tetramer. PFKP-E657A had reduced affinity for F6P 
(half-saturation at ~4.5 mM, compared with ~0.8 mM in WT) and an 
approximately twofold decrease in maximum activity (Fig. 3d, Extended Data 
Fig. 6a and Extended Data Table 2). Our data indicate that 
hydrophobic interactions are essential for the formation of tetramers while 
electrostatic interactions are required for optimal enzyme activity. 

The structure of PFKP provides a foundation for understanding the 
functional effects of somatic PFK1 mutations identified in cancers. 
Cancer cells rely on aerobic glycolysis to provide energy and cellular 
building blocks required to support rapid proliferation. PFK1 activity 
is increased in cancer cell lines and primary tumour tissues, and 
expression of PFKP is upregulated in breast and liver cancers. 
The effect of somatic mutations in PFK1 on metabolic adaptation 
has not been reported. We mapped the 44 reported somatic mutations 
Figure 3 | Interactions at the tetramer interface of PFKP regulate enzyme 
activity. a, Interface of the PFKP tetramer, with one of two tetramer interfaces 
involving residues from subunit A (cyan) and subunit D (magenta). Two views 
of the hydrophobic interactions at the tetramer interface and predicted 
electrostatic interactions between Arg613 of one subunit with Glu657 of the 
adjacent subunit. Arrows indicate the position of the two-fold symmetry axis in 
the tetramer interface, relating the two subunits. b, TEM images of WT and 
PFKP-F649L particles in buffer containing activator and substrates or 
WT in buffer containing the inhibitor citrate. c, Scatter plot of length versus 
width for particles observed in TEM. WT with activator and substrate (black, 
n = 53); F649L with activator and substrate (blue, n = 77); WT with citrate, 
tetramers (green, n = 76) and dimers (red, n = 41). d, F6P dependence of 
PFKP WT (black squares) and E657A (grey circles) at 0.25 mM ATP. Data are 
means ± s.e.m. of eight (WT) and five (E657A) determinations from two 
independent protein preparations. 


in cancers that were not associated with single nucleotide 
polymorphisms onto the structure of PFKP (Fig. 4a and Extended Data 
Table 3). Analysis by Mutation Assessor predicted that 28 of these 
mutations would alter enzyme activity. 

We selected three identified somatic mutations for biochemical 
analysis (Extended Data Fig. 7). Arg48 interacts with a bound 
phosphate ion in the structure (Fig. 4b), and the R48C mutant had reduced 
citrate inhibition, shifting the EC50 for citrate from 0.4 mM for WT to 
greater than 4 mM (Fig. 4d), but did not markedly change the effects of ATP 
and F6P (Fig. 4e, f and Extended Data Table 2). Analogous mutations in 
PFKM have been described in Tarui disease. These data indicate 
that Arg48 is located in the citrate-binding site, which is occupied by 
the phosphate ion in the current structure. A serine substitution for 
Asn426, located close to the catalytic interface, is predicted to disrupt 
interactions with the backbone carbonyls of Gln472, Gly473 and 
Gly474 and the main-chain amide of Ile476, which are involved 
in positioning a loop at the catalytic interface (Fig. 4c). The N426S 
mutant partly relieves ATP inhibition, shifting the EC50 for ATP from ~1 mM 
to greater than 3 mM (Fig. 4e). Located across the catalytic interface 
from Asn426, Asp564 forms an electrostatic interaction with Arg319 
(Fig. 4b). The D564N mutant had decreased maximum velocity and 
affinity for F6P (Fig. 4f and Extended Data Table 2). We also stably 
Figure 4 | Somatic cancer mutations of PFKP alter enzymatic activity and 
allosteric regulation. a–c, Location of indicated PFKP mutations in human 
cancers identified from the COSMIC database and mapped onto the catalytic 
interface of the PFKP subunit. Mutations chosen for further analysis are 
denoted with coloured boxes, including Arg48Cys (a, red) and location of 
Arg48 at the PO₄³⁻-binding site (b), Asp564Asn (a, green) and ionic bond of 
Asp564 with Arg319 (b), and Asn426Ser (a, blue) and location of Asn426 at the 
catalytic interface (c). d–f, The effect of mutations on citrate inhibition (d), ATP 
activation and inhibition (e) and affinity for F6P (f). Data are means ± s.e.m. of 
seven (d), five (e) and seven (f) determinations from two independent protein 
preparations. WT, black circles; Arg48Cys, red squares; Asn426Ser, blue 
triangles; Asp564Asn, green triangles. g, Lactic acid excretion (micromoles of 
lactate excreted per hour per microgram of total cell lysate) from MTLn3 rat 
mammary adenocarcinoma cells expressing WT and mutant PFKP-GFP. Data 
are means ± s.e.m. of four experiments. *P < 0.05; **P < 0.01. 


expressed PFKP WT and mutants tagged with green fluorescent 
protein (GFP) in MTLn3 rat mammary adenocarcinoma cells (Extended 
Data Fig. 7b). In cell lysates, PFK1 activity was greater with expression 
of WT, N426S and D564N but not R48C compared with untransfected 
or GFP controls (Extended Data Fig. 7c). Lactic-acid excretion was also 
greater with cells expressing WT and N426S but significantly less with 
D564N compared with GFP controls (Fig. 4g). Inhibition of glycolytic 
flux by loss-of-function mutations, such as D564N, may confer a 
selective advantage for cancer cell growth and metastasis by redirecting 
carbon flow through the pentose phosphate pathway, similar to that 
observed by glycosylation-dependent PFK1 inhibition. However, 
relief of inhibition by allosteric regulators had no effect on lactate 
production in the glutamine-free cell-culture conditions we used. This 
finding could reflect the ability of PFK1 to dynamically alter metabolic 
states by integrating multiple signals. Additionally, the functional 
significance of selective PFKP mutations will depend on the mutational 
signature of the respective cancer in which they occur as well as the 
relative expression of other PFK1 isoforms. 

In addition to cancer, aberrant glycolytic flux is increasingly recog- 
nized as contributing to several other diseases such as obesity, diabetes 
and Tarui disease. The biologically relevant tetrameric structures of 
PFKP provide information on the catalytic interface and on the 
conformational changes upon ATP hydrolysis, and so contribute to a 
mechanistic understanding of the functional impact of disease-associated 
mutations. Additionally, these new structural insights will enable rational 
drug design for therapeutic development. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. 
Cloning, expression and purification of recombinant human PFKP. Homo 
sapiens PFKP complementary DNA (cDNA) (NM_002627.4) encoding the 
784-amino-acid isoform 1 was cloned into the pFastBac HTa vector and baculovirus 
was generated using the Bac-to-Bac Expression system (Invitrogen) according to 
the manufacturer’s protocols. Two billion Sf21 or Hi5 cells were used to express 
PFKP at a multiplicity of infection of 1 for 48 h. Cell pellets were resuspended in 
lysis buffer (20 mM tris(hydroxymethyl)aminomethane (Tris-HCl; pH 7.5); 50 
mM potassium phosphate; 1 mM 2-mercaptoethanol; 10% glycerol; 10 mM 
imidazole; cOmplete Protease Inhibitor Cocktail tablet (Roche)) and lysed with 15 
passes of a Dounce homogenizer. Cell debris was removed by centrifugation and 
the pellet discarded. The supernatant was incubated with Talon resin (Clontech), 
washed with 20 bed volumes of lysis buffer and eluted with a minimal volume of 
elution buffer (lysis buffer with 100 mM imidazole). Protein was concentrated 
using an Amicon Ultracel-30K Centrifugal Filter Unit (Millipore) and buffer 
exchanged into FPLC buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 1 mM 
TCEP, 1 mM ATP, 1 mM MgCl₂, and 5% glycerol). PFKP was passed over a 
Superose 6 10/300 GL column (GE Healthcare) and the peak corresponding to 
the tetrameric fraction collected. Buffer was exchanged to crystallization buffer (20 
mM HEPES, pH 7.5, 100 mM KCl, 1 mM TCEP, 10 mM MgCl₂, and 5% glycerol) 
containing either 10 mM ADP or 10 mM ATP using an Amicon Ultracel-30K 
Centrifugal Filter Unit and recombinant PFKP concentrated to >5 mg ml⁻¹. 
Protein was stored at 4 °C. Recombinant PFKP was tested for activity and allosteric 
regulation before crystallization. 
PFK1 activity assays. Activity assays for PFK1 were performed using an auxiliary 
enzyme assay. Kinetic studies were performed in 200 µl reactions containing 50 
mM HEPES pH 7.4, 100 mM KCl, 10 mM MgCl₂, 0.15 mM NADH, 0.675 units 
ml⁻¹ aldolase, 5 units ml⁻¹ triosephosphate isomerase and 2 units ml⁻¹ glycerol 
phosphate dehydrogenase. ATP and F6P were used as indicated. Auxiliary 
enzymes were de-salted using an Amicon Ultracel-10K Centrifugal Filter Unit 
before use. The concentration of PFKP was normalized and samples diluted as a 
10× stock in 10% glycerol, 20 mM Tris-HCl (pH 7.5) and 1 mM DTT immediately 
before the assay. The temperature was equilibrated to 25 °C for 10 min before 
initiating the reaction with the addition of PFKP. The absorbance at 340 nm was 
measured using a SpectraMax M5 microplate reader (Molecular Devices). Kinetic 
parameters were generated by linear regression analysis of the Hill equation using 
Prism (GraphPad Software) and are the average of a minimum of three 
measurements from two independent preparations of protein (R² > 0.95 for all analyses). 
An unpaired t-test with equal variance was used to compare the activity of WT and 
F649L PFKP. One unit (U) of activity is defined as the amount of enzyme that 
catalyses the formation of 1 µmol of fructose-1,6-bisphosphate per minute at 
25 °C. Data on the effect of sulphate on PFK1 activity were obtained in the presence of 
10 mM sodium sulphate or 10 mM sodium chloride as a control. 
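
As a rough illustration of how an absorbance trace from this kind of coupled assay can be turned into a rate, the sketch below applies the Beer-Lambert law to the slope of A340. It is not the authors' analysis: the function name and the optical path length (0.56 cm for 200 µl in a 96-well plate) are assumptions, and the path length in particular should be measured for the plate actually used. The molar absorption coefficient of NADH at 340 nm (6,220 M⁻¹ cm⁻¹) and the oxidation of two NADH per fructose-1,6-bisphosphate in the aldolase, triosephosphate isomerase and glycerol phosphate dehydrogenase system are standard values.

```python
NADH_EXTINCTION_M_CM = 6220.0  # NADH at 340 nm, in M^-1 cm^-1
NADH_PER_F16BP = 2             # coupled system oxidizes two NADH per F1,6bP


def f16bp_umol_per_min(delta_a340_per_min, volume_l=200e-6, path_cm=0.56):
    """Micromoles of F1,6bP formed per minute in one well.

    path_cm is an ASSUMED optical path for 200 ul in a 96-well plate.
    """
    # Beer-Lambert: rate of concentration change (mol per litre per min)
    nadh_molar_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION_M_CM * path_cm)
    # Scale to the reaction volume and convert mol to micromoles
    nadh_umol_per_min = nadh_molar_per_min * volume_l * 1e6
    return nadh_umol_per_min / NADH_PER_F16BP


# Hypothetical slope: A340 falls by 0.1 per minute as NADH is consumed
rate = f16bp_umol_per_min(-0.1)
print(f"{rate:.5f} umol F1,6bP per min")
```

Dividing the result by the mass of enzyme in the well would give a specific activity.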
Transmission electron microscopy. Twenty microlitres of 25 µg ml⁻¹ PFKP was 
applied to glow-discharged carbon-coated grids and stained with 2% (w/v) uranyl 
acetate. Grids were examined and photographed with a JEOL 100CX II. For 
estimation of the size of PFKP dimers and tetramers, the length and width of 
individual particles from TEM images were measured using FIJI ImageJ software. 
The average length and width ± s.d. are reported. For experiments analysing the 
shape and size of PFKP for crystallography studies, the protein was diluted in TEM 
buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 1 mM DTT, 1 mM ATP, 1 mM 
MgCl₂, and 5% glycerol). For experiments analysing the oligomeric state of the 
enzyme in the presence of activators, WT and F649L PFKP were diluted in TEM 
buffer containing 3 mM ADP, 3 mM ATP and 8 mM F6P. For experiments 
analysing the oligomeric state of the enzyme in the presence of inhibitors, PFKP 
was diluted in TEM buffer containing 1 mM citrate. 
Crystallization and structure determination. PFKP was crystallized in two 
different complexes, with ADP and with ATP-Mg²⁺, by a microbatch method at 18 °C. For 
the ADP complex, 2 µl of protein solution containing PFKP (6.35 mg ml⁻¹) was 
mixed with 1 µl of the precipitant solution consisting of 200 mM potassium 
sodium tartrate tetrahydrate, pH 7.4, and 20% (w/v) PEG 3350. The same protein 
buffer was used for growing the crystals of PFKP in complex with ATP and 
Mg²⁺. These crystals were obtained using the microbatch method and a precipitant 
solution comprising 200 mM potassium thiocyanate, pH 7, and 20% (w/v) PEG 
3350. All crystals were cryoprotected by addition of 20% (v/v) ethylene glycol in 
the respective mother liquor and flash-frozen in liquid nitrogen for data collection 
at 100 K. 

Crystals of the PFKP complexes both belong to space group P2₁. However, the 
crystallographic asymmetric unit of the ADP-bound form contains four subunits 
of PFKP that are assembled as one tetramer, whereas that of the ATP-Mg²⁺ form 
contains two tetramers. A single-wavelength native data set to resolution 3.1 Å was 
collected at the X4C beamline of the National Synchrotron Light Source. The 
diffraction images were processed with the HKL package. The structure of 
PFK from rabbit skeletal muscle (PDB accession number 3O8L) was used to 
determine the ATP-Mg²⁺ structure of PFKP using the molecular replacement 
method, with the program MolRep. Only a monomeric model of PFK from 
rabbit skeletal muscle resulted in a solution, which led to structure determination 
of the ATP-bound PFKP structure. The remainder of the PFKP model was built 
manually with the program XtalView. The structure refinement was performed 
with CNS. A similar methodology was used for data collection and processing of 
the ADP-bound structure of PFKP, the crystal of which diffracted to 3.4 Å at the 
X4C beamline of the National Synchrotron Light Source. The ADP-bound 
structure was subsequently determined using a monomeric model of the ATP-bound 
complex of PFKP, with the program MolRep followed by structure refinement by 
CNS. The data processing and refinement statistics are summarized in Extended 
Data Table 1. The Ramachandran plots suggest that 88.1% and 81.9% of residues 
in the ATP-bound and ADP-bound complexes of PFKP, respectively, are in the most 
favoured regions, and that no residues are in disallowed regions. The trajectory 
between the ATP-Mg²⁺-bound and the ADP-bound structures was generated 
using UCSF Chimera. The structures were aligned with the Matchmaker tool and 
the trajectory calculated with the Morph Conformation tool. 

Selection of cancer mutations and generation of point mutants. Somatic 
mutations identified in human cancers were selected from the COSMIC database and 
known single nucleotide polymorphisms were disregarded. The mutations were 
modelled onto the structure of PFKP and selected for further analysis. Point 
mutants at the tetramer interface, F649L and E657A, and cancer mutants, 
R48C, N426S and D564N, were generated by using a commercially available 
site-directed mutagenesis kit (QuikChange Lightning, Agilent). DNA primers 
were designed using the online primer design tool 
(http://www.genomics.agilent.com/primerDesignProgram.jsp) and purchased from Elim Biopharmaceuticals. 
Analysis of cells expressing PFKP. A mammalian PFKP expression construct 
was generated by PCR amplification and the cDNA inserted into the multiple 
cloning site of pEGFP-N1 using the restriction enzymes XhoI and BamHI. Cancer 
mutations were generated by site-directed mutagenesis as described above. 
Constructs were expressed by transfecting MTLn3 rat mammary adenocarcinoma 
cells using FugeneHD (Promega) transfection reagent. One day after 
transfection, cells were re-plated into 100 mm dishes and 800 µg ml⁻¹ G418 was added to 
select for transfected cells. After 1 week of selection, fluorescence-activated cell 
sorting was used to sort cells expressing GFP. For metabolic assays, cells were 
seeded into a six-well plate at a density of 3 × 10⁵ cells per well. One day after 
re-plating, cells were washed twice in serum- and glutamine-free media, and cells 
were incubated for 2 h in 1 ml of the same media. Fifty microlitres of the media 
were collected in triplicate and the amount of lactic acid in the media measured 
using an enzyme-linked assay. One hundred microlitres of reagent A (300 mM 
hydrazine; 200 mM glycine, pH 9.5; 20 mM β-nicotinamide adenine dinucleotide) 
and 50 µl of reagent B (200 U ml⁻¹ L-lactate dehydrogenase from rabbit muscle 
(Sigma Aldrich)) were added to each well and incubated for 1 h at 22 °C. The 
absorbance at 340 nm was measured and the amount of lactate was determined 
from a standard curve. Cells were lysed in buffer (10 mM potassium phosphate, 
pH 7.5; 0.1% Triton X-100; cOmplete Protease Inhibitor Cocktail tablet (Roche)), 
cellular debris removed by centrifugation and the protein concentration 
determined by the Bradford method. Lactic-acid levels in the media were normalized to 
protein concentration. PFK1 activity assays were performed on the lysate as 
previously described. Enzyme-linked PFK1 activity assays were performed on 10 µg 
of total cell lysate as described above with the exception that 10 mM ammonium 
sulphate was added to the assay mixture and the auxiliary enzymes were not 
de-salted. Levels of PFKP expression were determined by immunoblotting using 
rabbit anti-GFP (Invitrogen A-11122, 1DB-001-0000868907) and mouse 
anti-actin clone C4 (EMD Millipore MAB1501, 1DB-001-0000850281) antibodies. 
Two-sided paired t-tests were used to determine statistical significance. 
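
The lactate quantification described above (absorbance read against a standard curve, then normalized to protein and time) can be sketched as follows. This is an illustrative calculation only: the standard-curve points, the sample absorbance, the protein amount and the function names are all hypothetical, while the 2 h incubation, 1 ml of medium and 50 µl sample volume are taken from the protocol.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lactate_rate(a340, slope, intercept, protein_ug,
                 hours=2.0, medium_ul=1000.0, sample_ul=50.0):
    """Micromoles of lactate excreted per hour per microgram of protein."""
    umol_in_sample = (a340 - intercept) / slope            # read off the curve
    umol_in_well = umol_in_sample * (medium_ul / sample_ul)  # scale to 1 ml
    return umol_in_well / hours / protein_ug


# Hypothetical standards: micromoles of lactate per well versus A340
standards_umol = [0.0, 0.01, 0.02, 0.04]
standards_a340 = [0.05, 0.15, 0.25, 0.45]
slope, intercept = fit_line(standards_umol, standards_a340)

# Hypothetical sample: A340 of 0.25 from a well holding 100 ug of protein
rate = lactate_rate(0.25, slope, intercept, protein_ug=100.0)
print(f"{rate:.4f} umol lactate per h per ug protein")
```

In practice the triplicate readings for each well would be averaged before the curve is applied.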
Extended Data Figure 1 | Activity of purified recombinant PFKP. 
a, Coomassie-stained SDS-polyacrylamide gel electrophoresis (SDS-PAGE) of 
purified PFKP. The molecular mass (MM) of protein standards is shown in 
kilodaltons (kDa). b, Allosteric regulation of PFKP by ATP and ADP; F6P 
saturation curve of PFKP in the presence of 0.25 mM ATP (filled black squares), 
3 mM ATP (filled grey circles), 0.25 mM ATP and 0.25 mM ADP (open black 
squares), and 3 mM ATP and 3 mM ADP (open grey circles). c, Effect of ADP 
on kinetic behaviour of PFKP in the presence of 0.25 mM (black squares) or 
3 mM (grey circles) ATP. Data in b and c are means ± s.e.m. of ten (b) or five 
(c) determinations from two separate protein preparations. 
Extended Data Figure 2 | Structure, nucleotide binding and phosphate ion 
binding of PFKP. a, The structure of the PFKP protomer can be divided into two 
halves: the amino (N)-terminal (cyan) and the carboxy (C)-terminal (yellow) 
subdomains. The N terminus of each subdomain begins with a 
nucleotide-binding domain (NBD) followed by a smaller substrate-binding domain (SBD). 
Each NBD closely resembles a canonical Rossmann fold composed of a 
seven-stranded β-sheet surrounded by six α-helices. Each SBD consists of a 
four-stranded β-sheet surrounded by five α-helices. Two phosphate ions (stick 
drawings) are bound in pockets equivalent to the effector binding sites of 
the E. coli PFK. b, Final 2Fo − Fc electron density at 3.1 Å resolution for 
ATP-Mg²⁺, contoured at 1σ. A strong electron density is observed near the β- and 
γ-phosphates of the nucleotide, which was unambiguously modelled as a Mg²⁺ 
ion. An extended but weaker electron density is also observed near the 
γ-phosphate of the nucleotide, which is surrounded by three backbone 
carbonyls of strictly conserved Ser32, Gly34 and Gly172. This electron density 
was modelled as a second metal ion, although it may belong to a water molecule. 
c, d, Structure of the two inorganic phosphate-binding sites in PFKP. e, Plot of 
concentration of ATP versus relative enzymatic activity of PFKP in the 
presence (black triangles) and absence (grey circles) of 10 mM sodium sulphate. 
Activity is expressed relative to maximal activity at this pH and F6P 
concentration. Data are means ± s.e.m. of three determinations. 
Extended Data Figure 3 | Structural comparison of the two PFKP tetramers 
in the ATP-Mg²⁺ complex. a, Overlay of the structures of the eight PFKP 
subunits. Only two loops show substantial differences, indicated with the red 
arrows. b, Overlay of the two PFKP tetramers. A noticeable difference is the 
twisting of the second dimer in the two tetramers, indicated with the red arrow. 
Extended Data Figure 4 | Structural comparison of ATP- and ADP-bound 
PFKP with R- and T-state of E. coli PFK. a, Structural overlay of ATP-bound 
(coloured) and ADP-bound (grey) PFKP. For comparison with structures of E. 
coli PFK, the N- and C-terminal subdomains of PFKP are coloured cyan and 
blue for subunit A, and yellow and orange for subunit B. b, The view in a is 
slabbed to highlight the difference between the two structures. c, Structural 
overlay of R-state (coloured; PDB accession number 4PFK) and T-state (grey; 
PDB accession number 6PFK) of E. coli PFK. d, The view in c is slabbed. 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


a
H. sapiens PFKP       639  ESCSENYTTDFIYQLYSEEGK  659
H. sapiens PFKM       629  EKCNENYTTDFIFNLYSEEGK  649
H. sapiens PFKL       628  EKCHDYYTTEFLYNLYSSEGK  648
S. cerevisiae alpha   827  EQASSVYSTOLLADIISEASK  847
S. cerevisiae beta    821  TNASKALSATKLAEVITAEAD  841

Extended Data Figure 5 | A unique tetramer interface in PFKP. 
a, Alignment of residues from PFKP surrounding Phe649 (arrow) with human 
PFKM and PFKL and S. cerevisiae PFK1 α- and β-subunits. b, Structure of the 
PFKP tetramer. c, Structure of the ScPFK tetramer, viewed roughly in the same 
orientation as PFKP. The tetramer interface is highlighted in the red box. 
d, Structure of rabbit PFKM. e, Stereo drawing of the overlay of the tetramer 
interface of PFKP (in colour) and ScPFK. 
Extended Data Figure 6 | Purification and TEM analysis of PFKP tetramer 
mutants. a, Coomassie-stained SDS-PAGE of PFKP F649L and E657A. 
b, c, TEM images of PFKP F649L (b) in buffer with activator and substrates 
(3 mM ADP, 3 mM ATP and 8 mM F6P) and WT PFKP (c) in buffer 
containing inhibitor (1 mM citrate). Red arrows indicate dimers. Scale bar, 
50 nm. d, Activity of WT PFKP and PFKP-F649L in buffer containing 3 mM 
ADP, 3 mM ATP and 8 mM F6P. Data are means ± s.e.m. of six (WT) and 
nine (F649L) determinations from two independent protein preparations 
(P < 0.001). 
Extended Data Figure 7 | Purification of PFKP cancer mutants and their 
activity in cells. a, Coomassie-stained SDS-PAGE of purified recombinant 
PFKP mutants R48C, N426S and D564N. b, Immunoblot of GFP and actin 
from total cell lysates of MTLn3 rat mammary adenocarcinoma cells expressing 
PFKP-GFP. Blots are representative of three experiments from individual 
preparations of cells. c, PFK1 activity (micromoles of F1,6bP produced per 
minute per nanogram of total cell lysate) was measured in five independent 
preparations of cells. A two-sided paired t-test was used to determine 
significance. **P < 0.01; ***P < 0.001. 
Extended Data Table 1 | Data collection and refinement statistics 

                          PFKP ATP-Mg²⁺ complex     PFKP ADP complex
Data collection
Space group               P2₁                       P2₁
Cell dimensions
  a, b, c (Å)             137.2, 159.3, 170.5       79.3, 168.4, 133.3
  α, β, γ (°)             90, 104.2, 90             90, 103.8, 90
Resolution (Å)            50–3.1 (3.2–3.1)*         45.4–3.4 (3.5–3.4)*
Rmerge                    11.7 (68.4)               15.6 (53.4)
I/σI                      13.9 (1.8)                4.3 (1.2)
Completeness (%)          93.0 (84.1)               84.8 (74.3)
Redundancy                6.4 (5.3)                 2.1 (1.9)
Refinement
Resolution (Å)            50–3.1 (3.3–3.1)          45.4–3.4 (3.6–3.4)
No. reflections           109,577 (10,980)          34,657 (3,486)
Rwork/Rfree               22.8/25.8                 24.2/29.4
No. atoms                 47,148                    23,644
  Protein                 46,792                    23,500
  Ligand/ion              316                       144
  Water                   40                        0
B-factors
  Protein                 62.8                      74.5
  Ligand/ion              45.6                      66.1
  Water                   22.3
R.m.s. deviations
  Bond lengths (Å)        0.009                     0.011
  Bond angles (°)         1.3                       1.2

One crystal was used for data collection for each structure. 
* Highest resolution shell is shown in parentheses. 
Extended Data Table 2 | Saturation kinetics on WT and mutant PFKP 

Parameter             Wild type   E657A   R48C     N426S   D564N
Maximum velocity      59.27       32.29   958.19   67.41   30.60
S0.5 F6P (mM)         0.83        4.51    0.84     0.82    2.04
nH                    3.41        4.13    2.94     3.64    3.18
EC50 ATP (mM)         0.96        ND      1.19     >3      0.68
EC50 citrate (mM)     0.40        ND      >4       0.31    1.40

Kinetic properties of WT and mutant protein were determined by modelling the sigmoidal part of the curve (Vmin to Vmax) to the Hill equation. Assays were performed at pH 7.4 with 0.25 mM ATP. For F6P affinity, 
assays were performed at pH 7.4 with 0.25 mM ATP. ATP and citrate inhibition assays were performed at pH 7.4 with 2 mM F6P (WT, R48C and N426S) or 4 mM F6P (D564N). Citrate inhibition assays were 
performed with 0.25 mM ATP. ND, not determined. 
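
For readers unfamiliar with the Hill equation, v = Vmax·Sⁿ / (S0.5ⁿ + Sⁿ), the sketch below shows one way its parameters can be recovered by linear regression on the classic Hill plot, log[v/(Vmax − v)] = n·log S − n·log S0.5. This is an assumption-laden illustration, not a reproduction of the Prism analysis: it takes Vmax as already known, treats Vmin as zero, uses noise-free synthetic data generated from the wild-type values in the table, and the function names are invented for the example.

```python
import math


def hill_rate(s, vmax, s_half, n_h):
    """Hill equation: rate at substrate concentration s."""
    return vmax * s ** n_h / (s_half ** n_h + s ** n_h)


def hill_plot_fit(substrate, rates, vmax):
    """Fit n_H and S0.5 by least squares on the linearized Hill plot.

    Assumes vmax is known and every rate lies strictly between 0 and vmax.
    """
    xs = [math.log10(s) for s in substrate]
    ys = [math.log10(v / (vmax - v)) for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the Hill plot is n_H
    intercept = mean_y - slope * mean_x    # intercept is -n_H * log10(S0.5)
    return slope, 10 ** (-intercept / slope)


# Synthetic data from the wild-type parameters reported in the table
WT = dict(vmax=59.27, s_half=0.83, n_h=3.41)
f6p_mm = [0.4, 0.6, 0.8, 1.0, 1.2, 1.6]
rates = [hill_rate(s, **WT) for s in f6p_mm]

n_h, s_half = hill_plot_fit(f6p_mm, rates, WT["vmax"])
print(round(n_h, 2), round(s_half, 2))  # recovers the inputs for noise-free data
```

With real, noisy measurements the linearized plot weights points near Vmax heavily, which is one reason direct nonlinear fitting of all parameters is often preferred.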
Extended Data Table 3 | Somatic mutations of PFKP in cancer 

Missense   Mutation   Ligand                  Predicted impact  Tissue/cancer type*
mutation*  ID*        interactions**          on activity**
S32R       1603388    ADP ATP                 High              Liver Carcinoma
R48H       917671     PGA ADP                 Medium            Endometrium Carcinoma
M49L       1675020                            Medium            Lung Carcinoma
I51V       538834                             Medium            Lung Carcinoma
D128G      1492251    F1,6bP ADP ATP          High              Kidney Carcinoma
G129W      1187879    ADP ATP                 High              Lung Carcinoma
Li3it      330744     ATP                     High              Lung Carcinoma
Q153K      1236264                            Neutral           Autonomic ganglia Neuroblastoma
ALS8Y      917677                             Low               Endometrium Carcinoma
D175Y      1347563    ATP F6P F1,6bP          High              Large Intestine Carcinoma
R219Q      1603390    ATP ADP F6P F1,6bP      High              Liver Carcinoma
E245Q      684504     ATP ADP                 Low               Lung Carcinoma
R262Q      1347567    PGA ADP                 Low               Large Intestine Carcinoma
E286K      1474583                            Medium            Breast Carcinoma
¥2931      255484                             Medium            Primitive neuroectodermal tumour - medulloblastoma
R301H      1347571    F6P                     High              Large Intestine Carcinoma
V308M      917679                             Medium            Endometrium Carcinoma
E328V      1603392                            Medium            Liver Carcinoma
A332T      291604                             Medium            Large Intestine Carcinoma
P407S      233090                             Neutral           Skin Malignant melanoma
A414D      1347575                            High              Large Intestine Carcinoma


—s 
on 


um Care 


AMAST      1220252                            Medium            Large Intestine Carcinoma
W463C      370998     PGA ADP                 High              Lung Carcinoma
G467A      332540     PGA ADP                 Neutral           Lung Carcinoma
T470I      1702000                            Neutral           Skin Malignant melanoma
A492T      1675022    F1,6bP                  High              Lymphoid neoplasm
A537S      26929                              Low               Lung Carcinoma


Large Intestine Carcinoma 


R575Q      1220254                            Low               Large Intestine Carcinoma
A603T      1492249                            Medium            Kidney Carcinoma
K627E      1347584    PGA ADP                 Low               Large Intestine Carcinoma
K627N      72181      PGA ADP                 Medium            Ovary Carcinoma
D648N      1560809                            Low               Large Intestine Carcinoma
N667Y      1347586                            Medium            Large Intestine Carcinoma
P680A      917689                             Low               Endometrium Carcinoma
I689M      1128065                            Neutral           Prostate Carcinoma
E703D      684499                             Medium            Lung Carcinoma
K709I      255302                             Low               Primitive neuroectodermal tumour - medulloblastoma
T713A      1239809                            Low               Oesophagus Carcinoma
E734G      98079                              Low               Upper aerodigestive tract Carcinoma
M758I      117577                             Medium            Ovary Carcinoma
L761P      538526                             Medium            Lung Carcinoma

Residues highlighted in grey were chosen for further characterization. 
* Missense mutations, mutation identification numbers and the tissue/cancer type in which each mutation was identified are from the COSMIC database. 
** Ligand interactions and predicted impact on activity were obtained from Mutation Assessor. Mutations with ‘High’ or ‘Medium’ impact are predicted to alter enzyme activity. 
ILLUSTRATION BY THE PROJECT TWINS 


COMPUTERS READ 
THE FOSSIL RECORD 


Palaeontologists hope that software can construct fossil 
databases directly from research papers. 


BY EWEN CALLAWAY 


As a field whose raison d’être is to chronicle the deep past, palaeontology 
is remarkably forward-looking when it comes to organizing its data. Victorian 
natural history museums meticulously organized their collections with 
handwritten cards that survive to this day. And over the past 15 years, 
researchers have collectively entered records of more than a million fossils 
into an online database, allowing them to track broad trends in the history 
of life. Now, palaeontologists are exploring the use of machine algorithms to 
pull fossil data from their research papers automatically. 

“I’m fairly convinced that this is the future, for sure,” says Shanan 
Peters, a palaeontologist at the University of Wisconsin–Madison 
(UW–Madison) who is co-leading an effort to use software to extract 
information from tens of thousands of palaeontology papers. “Building 
a database, per se, will be a thing of the past. Those databases will be 
dynamically generated based on the questions you’re interested in, and 
the machine will do the heavy lifting.” 

Peters should know. He is the principal inves- 
tigator of the Paleobiology Database (PBDB; 
paleobiodb.org), which details the age, loca- 
tion and identity of some 1.2 million fossils. 
Since it was started in 1998, researchers have 
spent about 80,000 hours — the equivalent of 
9 continuous years — entering and opining 
over data from original field research and 
around 40,000 articles. The PBDB has produced hundreds of papers and has 
allowed palaeontologists to address questions that would have been otherwise 
unanswerable, on topics ranging from epoch-wide extinction rates to 
the disappearance of certain dinosaurs. 

The PBDB is a database created by experts: 
around 380 scientists have uploaded some 
560,000 published opinions on the classifica- 
tions of 320,000 taxonomic names. But Peters 
was curious to know whether such a database 
could be compiled automatically by computer. 
So in 2013 he started a collaboration with Miron
Livny and Chris Ré, then data scientists at UW 
Madison (Ré has since moved on to Stanford 
University in California). Ré had developed 
software called DeepDive, which mines writ- 
ten text (such as words in a research paper) 
and pulls out facts. Text mining — or content 
mining — is now a commonplace tool in com- 
puter science and is slowly beginning to find 
uses in research fields from genomics to drug 
discovery. Text mining palaeontology literature 
appealed to Ré, partly because the PBDB offers 
a human-curated database with which to com- 
pare a computer-generated counterpart. 


PARSING THE PAST 

DeepDive begins by parsing research papers 
in a manner that would be familiar to anyone 
who remembers their early grammar lessons. 
“It’s taking those papers and converting them
into text,” says Ré: it is trying to determine the
answer to questions such as, “What's a noun, 
what's a verb and how do you diagram a sen- 
tence?” Next, DeepDive attempts to predict 
the concepts that are stored in those sentences 
(such as, for palaeontology, the names of fos- 
sils and the places where they were found) and 
assigns a probability to each assertion. The 
result is software “which is usually imperfect
in a lot of ways”, says Ré. “That’s where you get
the domain scientist involved.” 
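
The two steps Ré describes, proposing candidate facts and then attaching a probability to each, can be sketched in a few lines of Python. This is a toy illustration, not DeepDive's code or interface: the word lists, the regular expression, the `Assertion` record and the feature weights are all invented for the example, and a simple logistic score stands in for whatever model DeepDive actually uses.

```python
import math
import re
from dataclasses import dataclass
from typing import List

# Toy word lists: illustrative assumptions, not DeepDive or PBDB data.
KNOWN_PLACES = ["Mongolia", "Montana", "Wyoming"]
CUE_WORDS = {"collected", "discovered", "found", "recovered"}

# Crude pattern for a binomial name such as "Tyrannosaurus rex".
# It over-generates on ordinary capitalised phrases; a real system
# would parse the sentence instead.
BINOMIAL = re.compile(r"\b[A-Z][a-z]+ [a-z]{3,}\b")


@dataclass
class Assertion:
    taxon: str
    place: str
    probability: float
    source: str  # sentence the assertion came from, kept for traceability


def score(sentence: str, taxon: str, place: str) -> float:
    """Logistic score from two hand-picked features; the weights are invented."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    has_cue = 1.0 if words & CUE_WORDS else 0.0
    gap = abs(sentence.index(place) - sentence.index(taxon))
    near = 1.0 if gap < 60 else 0.0
    z = -1.0 + 2.0 * has_cue + 1.5 * near
    return 1.0 / (1.0 + math.exp(-z))


def extract(sentence: str) -> List[Assertion]:
    """Pair every candidate taxon with every known place named in the sentence."""
    taxa = BINOMIAL.findall(sentence)
    places = [p for p in KNOWN_PLACES if p in sentence]
    return [
        Assertion(t, p, score(sentence, t, p), sentence)
        for t in taxa
        for p in places
    ]


for a in extract("Tyrannosaurus rex was found in Montana."):
    print(a.taxon, a.place, round(a.probability, 2))
```

Keeping the source sentence alongside each probability mirrors the property Peters highlights: every extracted fact stays linked to the text it came from, so a domain scientist can check the uncertain ones.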

Peters spent about a year refining the first- 
pass software so that, for instance, it knows 
where to look in palaeontology papers for 
the names of new species and the geographic 
locations in which they were discovered. Ré 
describes this process as a “back and forth” with
Peters that required Ré’s team of data scientists 
to come up with custom computing solutions to 
make the requests feasible. “I would love to say 
the answer is people can press a button and use it 
and run it and they don’t need us,” Ré says, but
that is a goal that his team has not yet reached. 

As a proof of principle, Peters and Ré used
custom software that they called PaleoDeep- 
Dive to create a text-mined, scaled-down ver- 
sion of the PBDB that incorporated around 
12,000 papers. In some ways the computer- 
generated database outshines the PBDB, Peters 
says, because all the information in it comes 
with a probability assigned to it and is linked 
back to the original text. “The machine is really 
clear about uncertainty, when there's ambiguity, 
or differences between documents and authors,” 
Peters says. PaleoDeepDive also managed to
extract 192,000 opinions on the classification 
of taxonomic names from the papers, whereas 
the PBDB’s human curators found only 80,000. 

PaleoDeepDive did not do such a bad job at
organizing that information either. In a December
2014 paper, Ré and Peters report that from a
random sample of 100 statements drawn from
the computer-generated database, 92% were
correct, which they say was similar to the
accuracy of the PBDB (S. E. Peters et al.
PLoS ONE 9, e113523; 2014). The two databases
also scored similarly in a second experiment, when
scientists were presented with five documents
and asked to score the accuracy of facts that had
been mined from them by the PBDB and by
the computer.
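
Because the 92% figure rests on a sample of only 100 statements, it carries some sampling uncertainty. As a rough illustration (this is not a calculation from the paper), the sketch below computes a Wilson score interval for 92 correct statements out of 100. The choice of the Wilson method and of a 95% confidence level are assumptions made for the example.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# 92 of 100 sampled statements were judged correct.
low, high = wilson_interval(92, 100)
print(f"{low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 85% to 96%, which is wide enough that "similar to the accuracy of the PBDB" is about as strong a claim as a sample of this size supports.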

And perhaps most impressively, PaleoDeep- 
Dive was used to estimate species diversity and 
extinction rates over the past 500 million years, 
coming up with measures similar to those 
determined by the PBDB. 

“It’s a little scary, the machines are getting that
good. That’s just something that we’re going to
have to get used to,” says Mark Uhen, a
palaeontologist at George Mason University in Fairfax,
Virginia, who is on the PBDB’s executive council.
“I think it’s one of the best innovations that
palaeontology has had in a very long time,” says
Jonathan Tennant, a palaeontologist at Imperial
College London. He uses the PBDB every
day and thinks that text mining could serve as
a useful way to collect a large amount of data
for later manual inspection, but not as a full-on
replacement for human-curated databases
such as the PBDB. “I don’t see machines replacing
humans. I think it’s important that we retain
the human aspect of the analytics,” he says.

John Alroy, a palaeontologist at Mac- 
quarie University in Sydney, Australia, who 
co-founded the PBDB but is no longer affili- 
ated with it, is less bullish on text mining. He 
says that DeepDive tends to overestimate the 
period during which species existed, leading 
to mistaken estimates of species diversity. He 
sees speed as the only advantage of text min- 
ing. “But there is no need to be fast in this case 
because the PBDB is already extremely com- 
prehensive, so pretty much any question you 
might want to ask can already be answered 
with it. That explains why it has generated so 
many publications,” Alroy says. 


TEXT-MINING FRUSTRATIONS 

Peters says that he will be using the computer- 
generated database as a supplement to the 
human-generated PBDB but adds that, for now, 
the limited number of documents it works from 
make it of little added use to palaeontologists. 
He wanted to let PaleoDeepDive loose on a 
bigger set of documents, but he did not have 
legal permission. As other text miners have 
discovered, many publishers of paywalled articles
are cautious about allowing researchers to
text mine their papers, even if they have lawful 
access to the literature; publishers tend to place 
limits on how the results of text mining can be 
published and reused, and often limit the num- 
ber of papers a scientist can download at any 
one time (see Nature 483, 134-135; 2012). “I 
can’t think of any single palaeontologist who has
40,000 papers in their own stash, at least legally 
acquired,” says Tennant. 

Peters and Livny spent months brokering 
a deal with one scientific publisher, Elsevier, 
to gain access to thousands of papers. “This is 
just the frustrating reality of things right now: 
advanced capabilities in machine reading and 
learning are coming out, and the bottleneck in 
progress is now getting documents together in 
one place for analysis,” Peters says. He and his
colleagues are working on amassing and pars- 
ing documents to feed into PaleoDeepDive 
and a related software tool for the geosciences 
literature called GeoDeepDive. Ré, meanwhile, 
is working with experts in other fields to apply 
DeepDive to drug development, genomics and 
human trafficking. 

Many palaeontologists also want to make it 
easier to find the data buried in their papers, 
so they are calling for research papers to be 
described more systematically in the future. 
“If we start having publication where every- 
thing is standard, then it will be much easier 
to read and process that data,” says Tennant. 
Uhen adds, “I think there’s a sort of cultural 
shift going on in palaeontology, where people 
are interested in data aggregation, and getting 
more insistent about being crystal clear about 
where you’re finding your fossils.”

Despite these challenges, many palaeontolo- 
gists see text mining as the way forward for 
their field. “It’s a huge waste of time for grad 
students and postdocs to manually re-enter 
already published information into a struc- 
tured database,” says Ross Mounce, a palae- 
ontologist at the Natural History Museum in 
London who is using text mining to track how 
the museum’s 80-million-specimen collection
is used in research papers. Peters hopes that 
efforts such as PaleoDeepDive will allow him 
and his colleagues more time to generate data 
instead of spending their days organizing data 
they already have. “I see these machine reading 
systems as liberating our efforts a little bit, and 
shifting our work back into the field and back 
into the museums.”


Ewen Callaway writes for Nature from 
London. 


CORRECTION 

The Toolbox article ‘How to catch a cloud’ 
(Nature 522, 115-116; 2015) gave the wrong 
location for the Texas Advanced Computing 
Center — it is in Austin not San Antonio. 


CLAIRE WELSH/NATURE 


CAREERS 




LEISURE ACTIVITIES 


The power of a pastime 


From painting to punching to aeroplane-jumping, the hobbies that scientists pursue offer a 
vital escape from the laborious life of the lab. 


BY CHRIS WOOLSTON 


Albert Einstein mastered the violin.
Richard Feynman banged bongos. Following
in the tradition of multi-talented
physicists, Federica Bianco likes to take a break 
from her research to punch people in the face. 
Bianco, an avid boxer who is also an astrophysi- 
cist at New York University, flew to Richmond, 
California, for her first professional bout in 
April. It did not go well for her opponent. Bianco 
pinned her competitor to the ropes with a flurry
of punches and did not let up until the referee 
called the fight. It took just one minute and 
twenty seconds. “I didn’t want to stop, but she 
was taking too much punishment,” Bianco says.


For Bianco, boxing is not just a hobby; it is a 
total mind-and-body escape from her work. “As 
a scientist, I’m thinking about all sorts of things
all the time,” she says. “The ring is quiet. You get 
tunnel vision. The other person is trying to take 
off your head and you have to deal with that.” 

At a time when competition for science 
funding and job promotions sometimes 
resembles a boxing match, many research- 
ers have trouble conceiving of an active life 
outside the lab. Indeed, there can be subtle — 
or not so subtle — pressures to sacrifice leisure 
time and put aside other interests for the sake 
of the next experiment, paper or conference 
talk. But many scientists say that their pastimes 
make them better researchers by sharpening 
their minds, building confidence and reducing
stress. Their experiences should offer hope to 
researchers who are feeling overwhelmed by 
the pressure of their jobs. Release can be just a 
ride, jump, joke or punch away. 

To be sure, some senior researchers in 
academia and other sectors still look askance at 
hobbies or leisure activities. Ryan Raver, now 
a product manager at the biotechnology firm 
Sigma-Aldrich in St Louis, Missouri, recalls 
an instance at the University of Wisconsin- 
Madison when one of his thesis-committee 
members was reluctant to sign off on his 
PhD because he thought that Raver spent too 
much time blogging and playing lead guitar 
in a hard-rock band. “He thought I should
have been more focused on my work,” Raver
says. “But playing in the band helped me sur- 
vive grad school. It kept my excitement and 
motivation up. It pushed me through the day.” 

Sean Carroll, a physicist at the California 
Institute of Technology in Pasadena, wrote a 
blogpost advising scientists to choose their 
hobbies carefully, especially if they ever want 
to win tenure. Specifically, he counsels them 
to stay away from pastimes that could drain 
attention from science. “You are better off if 
your hobbies are nothing like your work,” he 
writes. “Permissible hobbies include skydiving, 
playing guitar, or cooking. Suspicious hobbies 
include writing of any sort (novels, magazine 
articles, blogs), programming or web stuff, 
starting a business, etc. Why? Because there’s 
a feeling that this kind of activity represents 
time that could be spent on research.”

Carroll, who in 2005 was denied tenure at a 
different institution, says in his blogpost that 
he regrets the time and effort that he put into 
writing Spacetime and Geometry: An Introduc- 
tion to General Relativity (Addison-Wesley, 
2003), a textbook that did not win him many 
points in scientific circles. In the blogpost, he 
deems the book “probably the worst thing I 
did personally”. 

Raver says that he had a much easier time 
following his outside passions once he left 
academia for an industry job. “Professors and 
academics want you to believe that the more
hours you put in, the more likely it is that you’ll
have quality data,” he says. “But people aren’t
machines. They need to take breaks and reset
their minds when things get tough.”


FIGHT SCIENCE WITH SCIENCE 

There is plenty of evidence that scientific 
research and leisure pursuits can coex- 
ist. A study published in 2008 found that 
Nobel prizewinners were more likely than 
other scientists or members of the public to 
have long-standing hobbies. Notably, the 
prizewinners were about 1.5 times more 
likely to actively pursue arts and crafts 
than were members of the US National 
Academy of Sciences (R. Root-Bernstein et al.
J. Psychol. Sci. Technol. 1, 51-63; 2008). For 
this sample, hobbies turned out to be better 
predictors of Nobel-level greatness even 
than reported IQ, which does not vary much 
between ‘top’ and ‘average’ scientists. 

Robert Root-Bernstein, a physiologist at 
Michigan State University in East Lansing and 
lead author of the study, says that it is hard to 
know whether hobbies themselves help to fuel 
genius or whether geniuses are simply more 
likely to take up hobbies (see ‘The secret lives
of polymaths’). “It’s probably some combination,”
he says. He also notes that, contrary to
public opinion, scientific masterminds tend 
to be more adventurous, daring and physically 
vigorous than are members of the general
public. “An unexpectedly large number of
Nobel laureates took up surfing when it came
into fashion in the 1960s,” he says.


NOBLE PASTIMES


The secret lives of polymaths


The ranks of Nobel prizewinners in the
sciences include many artists, musicians,
athletes and writers. Here are the hobbies of
seven notable Nobel laureates, and one
single-minded exception.


Frederick Banting, who shared the 1923
Nobel Prize in Physiology or Medicine
for his co-discovery of insulin, left behind
hundreds of paintings and sketches when
he was killed in a plane crash at the age of
49. One of his oil paintings sold for more
than US$76,000 in 2008.


In addition to playing the bongos, Richard
Feynman, who shared the Nobel Prize in
Physics in 1965 for his work on quantum
electrodynamics, sketched and painted
under the pseudonym Ofey. Female nudes
were a common subject.


Albert Einstein could play Beethoven
sonatas on his violin when he was a
teenager and he performed at many benefit
concerts in his later years. He once said
that “life without playing music is
inconceivable to me”.


Carol Greider, who shared the 2009 Nobel
Prize in Physiology or Medicine for her
groundbreaking work with chromosome
telomeres, is a dedicated athlete who has
competed in several triathlons.


May-Britt and Edvard Moser, a married
couple who shared the 2014 Nobel
Prize in Physiology or Medicine for their
discovery of the neural grid cells that help
humans and other animals to navigate their
surroundings, are avid mountain climbers
who got engaged on the summit of Mount
Kilimanjaro in Tanzania.


Stefan Hell, co-recipient of the 2014
Nobel Prize in Chemistry for his work with
fluorescent microscopy, is a saxophone
enthusiast who specializes in jazz and improv.


In a 2007 interview, geneticist Elizabeth
Blackburn told the journal Biotechniques that
she did not really have any major pastimes
outside of the lab. “People who love their
work don’t need hobbies,” she said. “Work is
your hobby.” She went on to share the 2009
Nobel Prize in Physiology or Medicine. C.W.

Some in academia do appreciate the value 
of hobbies and leisure pursuits for early- 
career researchers, perhaps because they have 
discovered it for themselves. Tony Ryan, a 
chemist and pro-vice-chancellor of the faculty
of science at the University of Sheffield, UK, has
hired many scientists over the years. He has
always been reluctant to offer a job to anyone
who is so focused on research that she or he has
no time for anything else. “We want to know that
you’re an A-1 excellent scientist, but we also
want to know that you’re a well-rounded person
whom students will relate to,” he explains.

He has his own obsession: despite the 
demands of his job, he logs about 8,000 kilo- 
metres on his bicycle every year. He bikes 
to and from work, and rides with a group of 
enthusiasts — who call themselves ‘Common 
Lane Occasionals’ — every Saturday morning. 
While pedalling, he likes to talk about prime 
numbers with a computer scientist, some- 
times to the annoyance of other members of 
the group, which includes a tree surgeon, a 
plumber and a physician. 

Ryan even brings his bike along to inter- 
national conferences, whether in London 
or Hong Kong. “I can fold it up and pack it 
into a suitcase,” he says. “Other people will
bring bikes of their own. Cycling has become
the new golf.”

Similarly, Alexander Suh, an evolutionary 
biologist at Uppsala University in Sweden, 
packs his climbing gear for any conference 
that might be near a rock face or climbing 
wall (pretty much all of them). Suh, who has 
been engaging seriously in the sport for about 
three years, says that his hobby has been great 
for professional networking. “So many biolo- 
gists are interested in rock climbing that you 
can meet up with someone anywhere you 
go,” he says. “We talk about science and we
talk about climbing.” When he is on his home 
campus, Suh tries to squeeze in an hour or so 
of climbing “whenever work gets crazy”. In 
addition to clearing his mind, scaling a wall 
helps to undo the physical toll of genomics 
research. “I get a caveman posture sitting in 
front of a computer all day,” he says. 

Many scientists enjoy climbing, but 
Maria Sapar prefers falling. A PhD student 
in molecular biology at Cornell University 
in Ithaca, New York, Sapar has made 147 
skydiving jumps, with many more ahead — 
not bad for someone who is a bit afraid of 
heights. “I’m not a big fan of roller coasters,”
she says. Fortunately, the vast distance to
the ground does not necessarily register
when she is leaning out of an aeroplane.
“It’s like looking at Google Maps,” she says.
Still, there is just enough risk and excitement
to put the rest of her life, especially
her research, in perspective. “When I’m
scared or nervous about something in
science, I think: Maria, you jump out of
aeroplanes. You can do this.”


Astrophysicist Federica Bianco duels her opponent in a boxing match in Richmond, California.
CHRIS ATKINS.

Although skydiving falls into Carroll’s 
category of ‘permissible hobbies’, Sapar does
not talk much about her pastime in the lab, 
and has decided against putting it on her 
CV. She reckons that future employers will 
care more about her research and publica- 
tion history than her jump count, and there 
is always the chance that someone might 
take a negative view of her hobby. “When I 
tell people about it face to face, I always get 
one of two reactions. It’s either, ‘Oh, that’s
cool,’ or ‘Why would you ever do that?’”

Adam Ruben, a malaria-vaccine 
researcher with the biotechnology com- 
pany Sanaria in Rockville, Maryland, has 
had some scary moments of his own while 
practising his hobby: stand-up comedy. 
As a graduate student at Johns Hopkins 
University in Baltimore, Maryland, he 
would head to the city’s clubs and bars to 
try out some jokes in front of often less- 
than-impressed crowds. “I'd go to open- 
mic nights where there were 30 other 
comedians and 5 audience members. It was 
terrible,” he says. After moving to nearby 
Washington DC, he started to perform for 
bigger audiences that were receptive to the
occasional foray into science humour. 

Ruben still takes time away from his 
work to develop his act and perform live 
shows. In addition to one-liners, he often 
tells stories about his time as a PhD student, 
a topic that he mined heavily for his book, 


Surviving Your Stupid, Stupid Decision 
to Go to Grad School (Three Rivers Press, 
2010). For example, he talks about the 
time that he worked three straight 21-hour 
days to provide data for an adviser’s 
presentation. The twist — as many sci- 
entists in his audiences might guess — is 
that the data never got used. He says that 
he is generally happy with his educa- 
tion and scientific career, but he is also 
grateful that he has a platform through 
which to joke about its flaws. “Academia 
could use more humour,” he says, even if
some of that humour has a sharp point.
“Only by complaining about something
can you actually do something about it.”

Like Ruben, Bianco is actively looking 
for gigs. She has yet to schedule her next 
bout, but is still devoting many hours to 
the ring. She spars with a partner several 
times a week, and she is always trying 
to get better. “Getting a PhD in physics 
made me a competitive person,” she says.
At first, she was worried that her fel- 
low researchers might look down on her 
hobby. But the word is out about how she 
spends her time outside the lab, and she 
has been pleasantly surprised by the posi- 
tive responses from both the boxing and 
physics communities. Boxers whom she 
meets are always amazed to learn that she 
is an astrophysicist, and physicists have 
been completely supportive of her pugilism. 
“Everyone is amused, interested and somehow,
even proud,” she says.

Depending on the setting, she is either a 
physicist-boxer or a boxer-physicist. Either 
way, she is proof that scientists can be more 
than their work, especially if they happen 
to have a wicked right hook.


Chris Woolston is a freelance writer in 
Billings, Montana. 


© 2015 Macmillan Publishers Limited. All rights reserved.


TRADE TALK 
Science educator 


Elizabeth Waters 
manages the outreach 
teaching laboratories 
at the Rockefeller 
University in New 
York City, where 
high-school students 
and their teachers 
can use state-of-the- 
art equipment. She 
explains how boosting others’ enthusiasm for 
and understanding of science builds on what 
she liked most about scientific research. 


What skills from the lab help you to do your job?

As a researcher, I was fortunate enough to 
receive my own grant and manage the grant 
budget myself. I learned how to keep tasks and 
costs in line with the goals of the project. And 
learning how to establish collaborations with 
other researchers was very relevant to what I 
do now. Making sure that I understand other 
people’s expectations of my role and their 
expectations of their role — that is really criti- 
cal. All of the details around running a class 
smoothly depend on those skills. 


When did you first work with students? 

The lab that I worked in as a researcher at 
Rockefeller often hosted a high-school or col- 
lege student. I asked for students to mentor and 
realized that I was starting to think more big- 
picture about the students’ research experience, 
working out what kinds of projects would be 
good for them. I did that because I loved to see 
other people have the opportunity to talk about 
science and get excited. Now, we're bringing 
students into labs that are just like those in 
which Nobel prizes have been won. The itera- 
tions of moving from training one student to the 
next gave me skills that I use all the time. 


How did you make it into a career? 

I started talking to lots of people when I was a 
postdoc, asking, ‘What is your favourite part
of your job?’ I tried to identify the theme that
really resonated most with me. Science educa- 
tion was at the top of my list. Then I tailored 
my CV. I volunteered to organize a yearly 
outreach event for fifth-graders, and taught a 
medical laboratory class at Hunter College in 
New York City, where I was an adjunct profes- 
sor. Making time for those activities and for 
networking was not neglecting my research 
duties. It was serving my scientific career.


INTERVIEW BY MONYA BAKER 
This interview has been edited for length and clarity; 
see go.nature.com/gpmhxr for more. 
SCIENCE FICTION


BROKEN MAPS OF THE SEA 


BY PRESTON GRASSMANN 


The giants rise out of the sea,
their lumbering forms half-concealed
beneath the nets
of ancient hunters. They carry 
their histories behind them, 
the broken hulls of sunken 
ships, the red-orange rust 
of harpoons, the twisted 
remains of old guns, 
pulled over sand and 
stone. Mist pours in 
behind them, ushering 
them across the village, 
as if some part of that 
world wants to follow 
them out. 

As always, children 
stand behind the legs 
of parents, pointing at 
the giants’ strange bod- 
ies, their slow, careful steps 
onto land. Others stand far- 
ther away and offer prayers, 
remembering the weight of that 
history, long before the days the 
giants grew legs and walked on land. 

Now a song carries across the village.

“Do you know why those songs are called 
‘broken maps’?” he asks.

She hears them weaving in and out of 
single tones, diverging into high and low 
notes, as if tuning to each other. 

“Because they use sound to guide them- 
selves,” she says. Yuko closes her eyes, trying 
to imagine the undersea world. “Like a map.” 

“They weren't just about maps; they were 
histories. But they were drowned by our 
machines. All our seismic surveys, the sonar 
systems of passing ships, blaring through the 
sea every day were like screams, breaking 
their patterns apart.”

They approach the sand path, where a 
long winding road leads upward through the 
town, to a place Yuko has seen only from a 
distance. 

“Millions of years before we came along, 
the oceans were filled with their histories.” 

She can see the parts of the old hunters’ 
ships and weapons carefully assembled as 
they approach the temple, the remains that 
form the framework of its foundations. 

“And then we came along and began to hunt 
them,” Yuko says. “Why do they come back?”

“They come out of the sea to sing again, to 
remember their songs. And offer us a part of 
their memory.” 
History lesson.


As Yuko walks with her father, she hears 
a complex aria, a song that slowly surfaces 
in her mind. Other sounds play out beneath 
them, rhythms that are older than human- 
kind. Soon, she will be among the memory- 
bearers. 

When they reach the top of the hill, the 
giants stop, their voices coursing through 
a familiar pattern. She had heard it many 
times in preparation for this day. But there 
are new patterns now, unfamiliar histories 
playing out between the notes. She tries to 
remember the lessons of her father, of every- 
thing he taught her about how to carry the 
songs in her mind. 

The giants pull themselves out of the 
nets, carrying parts of the temple out, plac- 
ing them against the wall of the structure for 
assembly. Their temple is finally beginning 
to take shape, a construction in the form 
of their ancient bodies, with a single long
tail, and a vast head turned up to the sky.

… ritual path until she stands just below them.
Their heads are concealed in mist far 
above her tiny form, but she feels their 
thoughts already turning towards 
her. The world is a shifting ter- 
rain now, as if she is both here, 
on this Taiji hill, and some- 
where far below the waves. 
She can feel her father 
standing next to her, 
helping her stand. “Stay 
with it,” he says, but his 
voice is far away. She 
is somewhere beyond 
the shore, descending 
in long spirals. She 
feels herself choking, 
struggling to breathe, 
the music opening into 
patterns of motion and 
light. And then she is 
breathing from the bor- 
rowed lungs of a giant. 
The farther down she 
goes, the more she falls inside 
the music, into a space far from 
the seaside village of her birth. She 
turns slowly, held by webs of sound 
that unspool like silk from their voices, 
weaving patterns that become maps and 
histories, and stories that she has never 
heard before. 

Her father is kneeling next to her. There 
is a light at the edge of his eyes, crystalline, 
as if reflecting the patterns of the sea, a coral 
reef at the edge of a cave. She can still feel the 
giant, a ghost version of its body continuing 
beyond her own. 

“You are now a memory-bearer,” her 
father says, holding her up. Others gather 
around them, the giants turning their mas- 
sive bodies to face them. She stares out at 
the fields beyond the temple and watches 
them change into a garden of coral. She feels 
the ghosts of other forms floating through 
depths she has never seen until now. A pro- 
cession forms, opening into a line that leads 
to the temple and she walks towards it, pre- 
paring to tell them a story they have never 
heard before.


Preston Grassmann became a freelance 
writer after working as a regular reviewer 
for Locus Magazine. His most recent
work has been published in AE: Canadian
Science Fiction, Daily Science Fiction, 
Mythic Delirium and Slave Stories: Scenes 
From the Slave State. 